Esempio n. 1
0
func (analyser ModuleAnalyser) digestText(text string) *word.Digester {
	digester := word.NewDigester(text)
	digester.RemoveOccurence(analyser.module.GetCode())
	result := word.NewGrammarFilter(digester).GetData()
	digester.SortByCounter("DESC")
	return result
}
Esempio n. 2
0
// AnalyseText removes the table from the list
func (controller *Analyse) AnalyseText() {
	description := "Computing professionals should not regard the computer as just a black box that executes programs by magic. Theknowledge area Architecture and Organization builds on Systems Fundamentals (SF) to develop a deeper understanding of the hardware environment upon which all computing is based, and the interface it provides to higher software layers. Students should acquire an understanding and appreciation of a computer system’s functional components, their characteristics, performance, and interactions, and, in particular, the challenge of harnessing parallelism to sustain performance improvements now and into the future. Students need to understand computer architecture to develop programs that can achieve high performance through a programmer’s awareness of parallelism and latency. In selecting a system to use, students should be able to understand the tradeoff among various components, such as CPU clock speed, cycles per instruction, memory size, and average memory access time. The learning outcomes specified for these topics correspond primarily to the core and are intended to support programs that elect to require only the minimum 16 hours of computer architecture of their students.  For programs that want to teach more than the minimum, the same AR topics can be treated at a more advanced level by implementing a two-course sequence.  For programs that want to cover the elective topics, those topics can be introduced within a two-course sequence and/or be treated in a more comprehensive way in a third course Algorithms and Complexity FAR - Architecture and Organization Algorithms are fundamental to computer science and software engineering. The real-world performance of any software system depends on: (1) the algorithms chosen and (2) the suitability and efficiency of the various layers of implementation. Good algorithm design is therefore crucial for the performance of all software systems. Moreover, the study of algorithms provides insight into the intrinsic nature of the problem as well as possible solution techniques independent of programming language, programming paradigm, computer hardware, or any other implementation aspect. An important part of computing is the ability to select algorithms appropriate to particular purposes and to apply them, recognizing the possibility that no suitable algorithm may exist. This facility relies on understanding the range of algorithms that address an important set of well-defined problems, recognizing their strengths and weaknesses, and their suitability in particular contexts. Efficiency is a pervasive theme throughout this area. Computational Science is a field of applied computer science, that is, the application of computer science to solve problems across a range of disciplines. In the book Introduction to Computational Science [3], the authors offer the following definition: “the field of computational science combines computer simulation, scientific visualization, mathematical modeling, computer programming and data structures, networking, database design, symbolic computation, and high performance computing with various disciplines.”  Computer science, which largely focuses on the theory, design, and implementation of algorithms for manipulating data and information, can trace its roots to the earliest devices used to assist people in computation over four thousand years ago. Various systems were created and used to calculate astronomical positions. Ada Lovelace’s programming achievement was intended to calculate Bernoulli numbers. In the late nineteenth century, mechanical calculators became available, and were immediately put to use by scientists. The needs of scientists and engineers for computation have long driven research and innovation in computing. As computers increase in their problem-solving power, computational science has grown in both breadth and importance. It is a discipline in its own right [2] and is considered to be “one of the five college majors on the rise [1].”  An amazing assortment of sub-fields have arisen under the umbrella of Computational Science, including computational biology, computational chemistry, computational mechanics, computational archeology, computational finance, computational sociology and computational forensics.Some fundamental concepts of computational science are germane to every computer scientist (e.g., modeling and simulation), and computational science topics are extremely valuable components of an undergraduate program in computer science. This area offers exposure to many valuable ideas and techniques, including precision of numerical representation, error analysis, numerical techniques, parallel architectures and algorithms, modeling and simulation, information visualization, software engineering, and optimization. Topics relevant to computational science include fundamental concepts in program construction (SDF/Fundamental Programming Concepts), algorithm design (SDF/Algorithms and Design),  program testing (SDF/Development Methods),  data representations (AR/Machine Representation of Data), and basic computer architecture (AR/Memory System Organization and Architecture).At the same time, students who take courses in this area have an opportunity to apply these techniques in a wide range of application areas, such asmolecular and fluid dynamics, celestial mechanics, economics, biology, geology, medicine, and social network analysis. Many of the techniques used in these areas require advanced mathematics such as calculus, differential equations, and linear algebra. The descriptions here assume that students have acquired the needed mathematical background elsewhere. Discrete structures are foundational material for computer science. By foundational we mean that relatively few computer scientists will be working primarily on discrete structures, but that many other areas of computerscience require the ability to work with concepts from discrete structures. Discrete structures include important material from such areas as set theory, logic, graph theory, and probability theory. The material in discrete structures is pervasive in the areas of data structures and algorithms but appears elsewhere in computer science as well. For example, an ability to create and understand a proof—either a formal symbolic proof or a less formal but still mathematically rigorous argument—is important in virtually every area of computer science, including (to name just a few) formal specification, verification, databases, and cryptography.  Graph theory concepts are used in networks, operating systems, and compilers. Set theory concepts are used in software engineering and in databases.  Probability theory is used in intelligent systems, networking, and a number of computing applications. Given that discrete structures serves as a foundation for many other areas in computing, it is worth noting that the boundary between discrete structures and other areas, particularly Algorithms and Complexity, Software Development Fundamentals, Programming Languages, and Intelligent Systems, may not always be crisp.  Indeed, different institutions may choose to organize the courses in which they cover this material in very different ways.  Some institutions may cover these topics in one or two focused courses with titles like 'discrete structures' or 'discrete mathematics', whereas others may integrate these topics in courses on programming, algorithms, and/or artificial intelligence.  Combinations of these approaches are also prevalent (e.g., covering many of these topics in a single focused introductory course and covering the remaining topics in more advanced topical courses). Computer graphics is the term commonly used to describe the computer generation and manipulation of images. It is the science of enabling visual communication through computation. Its uses include cartoons, film special effects, video games, medical imaging, engineering, as well as scientific, information, and knowledge visualization. Traditionally, graphics at the undergraduate level has focused on rendering, linear algebra, and phenomenological approaches. More recently, the focus has begun to include physics, numerical integration, scalability, and special-purpose hardware.  In order for students to become adept at the use and generation of computer graphics, many implementation-specific issues must be addressed, such as file formats, hardware interfaces, and application program interfaces. These issues change rapidly, and the description that follows attempts to avoid being overly prescriptive about them. The area encompassed by Graphics and Visualization is divided into several interrelated fields: • Fundamentals: Computer graphics depends on an understanding of how humans use vision to perceive information and how information can be rendered on a display device. Every computer scientist should have some understanding of where and how graphics can be appropriately applied as well as the fundamental processes involved in display rendering. • Modeling: Information to be displayed must be encoded in computer memory in some form, often in the form of a mathematical specification of shape and form.  • Rendering: Rendering is the process of displaying the information contained in a model. • Animation: Animation is the rendering in a manner that makes images appear to move and the synthesis or acquisition of the time variations of models. • Visualization: The field of visualization seeks to determine and present underlying correlated structures and relationships in data sets from a wide variety of application areas. The prime objective of the presentation should be to communicate the information in a dataset so as to enhance understanding • Computational Geometry: Computational Geometry is the study of algorithms that are stated in terms of geometry Human-computer interaction (HCI) is concerned with designing interactions between human activities and the computational systems that support them, and with constructing interfaces to afford those interactions. Interaction between users and computational artefacts occurs at an interface that includes both software and hardware. Thus interface design impacts the software life-cycle in that it should occur early; the design and implementation of core functionality can influence the user interface – for better or worse.Because it deals with people as well as computational systems, as a knowledge area HCI demands the consideration of cultural, social, organizational, cognitive and perceptual issues. Consequently it draws on a variety of disciplinary traditions, including psychology, ergonomics, computer science, graphic and product design, anthropology and engineering. Information assurance and security as a domain is the set of controls and processes both technical and policy intended to protect and defend information and information systems by ensuring their confidentiality, integrity,andavailability, andby providing for authentication and non-repudiation.  The concept of assurance also carries an attestation that current and past processes and data are valid.  Both assurance and security concepts are needed to ensure a complete perspective. Information assurance and security education, then, includes all efforts to prepare a workforce with the needed knowledge, skills, and abilities to protect our information systems and attest to the assurance of the past and current state of processes and data.  The importance of security concepts and topics has emerged as a core requirement in the Computer Science discipline, much like the importance of performance concepts has been for many years. Information Management is primarily concerned with the capture, digitization, representation, organization, transformation, and presentation of information; algorithms for efficient and effective access and updating of stored information; data modeling and abstraction; and physical file storage techniques. The student needs to be able to develop conceptual and physical data models, determine which IM methods and techniques are appropriate for a given problem, and be able to select and implement an appropriate IM solution that addresses relevant design concerns including scalability, accessibility and usability. Artificial intelligence (AI) is the study of solutions for problems that are difficult or impractical to solve with traditional methods. It is used pervasively in support of everyday applications such as email, word-processing and search, as well as in the design and analysis of autonomous agents that perceive their environment and interact rationally with the environment. The solutions rely on a broad set of general and specialized knowledge representation schemes, problem solving mechanisms and learning techniques. They deal with sensing (e.g., speech recognition, natural language understanding, computer vision), problem-solving (e.g., search, planning), and acting (e.g., robotics) and the architectures needed to support them (e.g., agents, multi-agents). The study of Artificial Intelligence prepares the student to determine when an AI approach is appropriate for a given problem, identify the appropriate representation and reasoning mechanism, and implement and evaluate it The Internet and computer networks are now ubiquitous and a growing number of computing activities strongly depend on the correct operation of the underlying network. Networks, both fixed and mobile, are a key part of the computing environment of today and tomorrow. Many computing applications that are used today would not be possible without networks. This dependency on the underlying network is likely to increase in the future. The high-level learning objective of this module can be summarized as follows: • Thinking in a networked world. The world is more and more interconnected and the use of networks will continue to increase. Students must understand how the networks behave and the key principles behind the organization and operation of the networks. • Continued study. The networking domain is rapidly evolving and a first networking course should be a starting point to other more advanced courses on network design, network management, sensor networks, etc. • Principles and practice interact. Networking is real and many of the design choices that involve networks also depend on practical constraints. Students should be exposed to these practical constraints by experimenting with networking, using tools, and writing networked software. There are different ways of organizing a networking course. Some educators prefer a top-down approach, i.e., the course starts from the applications and then explains reliable delivery, routing and forwarding. Other educators prefer a bottom-up approach where the students start with the lower layers and build their understanding of the network, transport and application layers later  An operating system defines an abstraction of hardware and manages resource sharing among the computer’s users. The topics in this area explain the most basic knowledge of operating systems in the sense of interfacing an operating system to networks, teaching the difference between the kernel and user modes, and developing key approaches to operating system design and implementation. This knowledge area is structured to be complementary to the Systems Fundamentals (SF), Networking and Communication (NC), Information Assurance and Security (IAS), and the Parallel and Distributed Computing (PD)knowledge areas. The Systems Fundamentals and Information Assurance and Security knowledge areas are the new ones to include contemporary issues. For example, Systems Fundamentals includes topics such as performance, virtualization and isolation, and resource allocation and scheduling; Parallel and Distributed Systems includes parallelism fundamentals; and and Information Assurance and Security includes forensics and security issues in depth.  Many courses in Operating Systems will draw material from across these knowledge areas. Platform-based development is concerned with the design and development of software applications that reside on specific software platforms.  In contrast to general purpose programming, platform-based development takes into account platform-specific constraints.  For instance web programming, multimedia development, mobile computing, app development, and robotics are examples of relevant platforms thatprovide specific services/APIs/hardware thatconstrain development.  Such platforms are characterized by the use of specialized APIs, distinct delivery/update mechanisms, and being abstracted away from the machine level. Platform-based development may be applied over a wide breadth of ecosystems. While we recognize that some platforms (e.g., web development) are prominent, we are also cognizant of the fact that no particular platform should be specified as a requirement in the CS2013 curricular guidelines.  Consequently, this Knowledge Area highlights many of the platforms that have become popular, without including any such platform in the core curriculum.  We note that the general skill of developing with respect to an API or a constrained environment is covered in other Knowledge Areas, such as Software Development Fundamentals (SDF).  Platform-based development further emphasizes such general skills within the context of particular platforms. The past decade has brought explosive growth in multiprocessor computing, including multi-core processors and distributed data centers.   As a result, parallel and distributed computing has moved from a largely elective topic to become more of a core component of undergraduate computing curricula.  Both parallel and distributed computing entail the logically simultaneous execution of multiple processes, whose operations have the potential to interleave in complex ways.  Parallel and distributed computing builds on foundations in many areas, including an understanding of fundamental systems concepts such as concurrency and parallel execution, consistency in state/memory manipulation, and latency.  Communication and coordination among processes is rooted in the message-passing and shared-memory models of computing and such algorithmic concepts as atomicity, consensus, and conditional waiting. Achieving speedup in practice requires an understanding of parallel algorithms, strategies for problem decomposition, system architecture, detailed implementation strategies, and performance analysis and tuning. Distributed systems highlight the problems of security and fault tolerance, emphasize the maintenance of replicated state, and introduce additional issues that bridge to computer networking. Because parallelism interacts with so many areas of computing, including at least algorithms, languages, systems, networking, and hardware, many curricula will put different parts of the knowledge area in different courses, rather than in a dedicated course.  While we acknowledge that computer science is moving in this direction and may reach that point, in 2013 this process is still in flux and we feel it provides more useful guidance to curriculum designers to aggregate the fundamental parallelism topics in one place.  Note, however, that the fundamentals of concurrency and mutual exclusion appear in the Systems Fundamentals (SF)Knowledge Area.  Many curricula may choose to introduce parallelism and concurrency in the same course (see below for the distinction intended by these terms).  Further, we note that the topics and learning outcomes listed below include only brief mentions of purely elective coverage.  At the present time, there is too much diversity in topics that share little in common (including for example, parallel scientific computing, process calculi, and non-blocking data structures) to recommend particular topics be covered in elective courses. Programming languages are the medium through which programmers precisely describe concepts, formulate algorithms, and reason about solutions.  In the course of a career, a computer scientist will work with many different languages, separately or together.  Software developers must understand the programming models underlying different languages and make informed design choices in languages supporting multiple complementary approaches.  Computer scientists will often need to learn new languages and programming constructs, and must understand the principles underlying how programming language features are defined, composed, and implemented. The effective use of programming languages, and appreciation of their limitations, also requires a basic knowledge of programming language translation and static program analysis, as well as run-time components such as memory management. Fluency in the process of software development is a prerequisite to the study of most of computer science.  In order to use computers to solve problems effectively, students must be competent at reading and writing programs in multiple programming languages.  Beyond programming skills, however, they must be able to design and analyze algorithms, select appropriate paradigms, and utilize modern development and testing tools.  This knowledge area brings together those fundamental concepts and skills related to the software development process.  As such, it provides a foundation for other software-oriented knowledge areas, most notably Programming Languages, Algorithms and Complexity, and Software Engineering.   It is important to note that this knowledge area isdistinct from the old Programming Fundamentals knowledge area from CC2001.  Whereas that knowledge area focused exclusively on the programming skills required in an introductory computer science course, this new knowledge area is intended to fill a much broader purpose.  It focuses on the entire software development process, identifying those concepts and skills that should be mastered in the first year of a computer science program.  This includes the design and simple analysis of algorithms, fundamental programming concepts and data structures, and basic software development methods and tools.  As a result of its broader purpose, the Software Development Fundamentals knowledge area includes fundamental concepts and skills that could naturally be listed in other software-oriented knowledge areas (e.g., programming constructs from Programming Languages, simple algorithm analysis from Algorithms & Complexity, simple development methodologies from Software Engineering). Likewise, each of these knowledge areas will contain more advanced material that builds upon the fundamental concepts and skills listed here. While broader in scope than the old Programming Fundamentals, this knowledge area still allows for considerable flexibility in the design of first-year curricula.  For example, the Fundamental Programming Concepts unit identifies only those concepts that are common to all programming paradigms.  It is expected that an instructor would select one or more programming paradigms (e.g., object-oriented programming, functional programming, scripting) to illustrate these programming concepts, and would pull paradigm-specific content from the Programming Languages knowledge area to fill out a course.  Likewise, an instructor could choose to emphasize formal analysis (e.g., Big-Oh, computability) or design methodologies (e.g., team projects, software life cycle) early, thus integrating hours from the Programming Languages, Algorithms and Complexity, and/or Software Engineering knowledge areas.  Thus, the 43 hours of material in this knowledge area will typically be augmented with core material from one or more of these knowledge areas to form a complete and coherent first-year experience. In every computing application domain, professionalism, quality, schedule, and cost are critical to producing software systems. Because of this, the elements of software engineering are applicable to developing software in all areas of computing.  A wide variety of software engineering practices have been developed and utilized since the need for a discipline of software engineering was first recognized. Many trade-offs between these different practices have also been identified. Practicing software engineers have to select and apply appropriate techniques and practices to a given development effort in order to maximize value. To learn how to do so, they study the elements of software engineering.Software engineering is the discipline concerned with the application of theory, knowledge, and practice to effectively and efficiently build reliable software systems that satisfy the requirements of customers and users. This discipline is applicable to small, medium, and large-scale systems. It encompasses all phases of the lifecycle of a software system, including requirements elicitation, analysis and specification; design; construction; verification and validation; deployment; and operation and maintenance. Whether small or large, following a traditional plan-driven development process, an agile approach, or some other method, software engineering is concerned with the best way to build good software systems.Software engineering uses engineering methods, processes, techniques, and measurements. It benefits from the use of tools for managing software development; analyzing and modeling software artifacts; assessing and controlling quality; and for ensuring a disciplined, controlled approach to software evolution and reuse. The software engineering toolbox has evolved over the years. For instance, the use of contracts, with requires and ensure clauses and class invariants, is one good practice that has become more common. Software development, which can involve an individual developer or a team or teams of developers, requires choosing the most appropriate tools, methods, and approaches for a given development environment. Students and instructors need to understand the impacts of specialization on software engineering approaches. For example, specialized systems include: • Real time systems • Client-server systems • Distributed systems•Parallel systems • Web-based systems • High integrity systems • Games• Mobile computing • Domain specific software (e.g., scientific computing or business applications) The underlying hardware and software infrastructure upon which applications are constructed is collectively described by the term 'computer systems.' Computer systems broadly span the sub-disciplines of operating systems, parallel and distributed systems, communications networks, and computer architecture.  Traditionally, these areas are taught in a non-integrated way through independent courses. However these sub-disciplines increasingly share important common fundamental concepts within their respective cores.  These concepts include computational paradigms, parallelism, cross-layer communications, state and state transition, resource allocation and scheduling, and so on.  The Systems Fundamentals Knowledge Area is designed to present an integrative view of these fundamental concepts in a unified albeit simplified fashion, providing a common foundation for the different specialized mechanisms and policies appropriate to the particular domain area. Undergraduates also need to understand the basic cultural, social, legal, and ethical issues inherent in the discipline of computing. They should understand where the discipline has been, where it is, and where it is heading. They should also understand their individual roles in this process, as well as appreciate the philosophical questions, technical problems, and aesthetic values that play an important part in the development of the discipline.Students also need to develop the ability to ask serious questions about the social impact of computing and to evaluate proposed answers to those questions. Future practitioners must be able to anticipate the impact of introducing a given product into a given environment. Will that product enhance or degrade the quality of life? What will the impact be upon individuals, groups, and institutions? Finally, students need to be aware of the basic legal rights of software and hardware vendors and users, and they also need to appreciate the ethical values that are the basis for those rights. Future practitioners must understand the responsibility that they will bear, and the possible consequences of failure. They must understand their own limitations as well as the limitations of their tools. All practitioners must make a long-term commitment to remaining current in their chosen specialties and in the discipline of computing as a whole. As technological advances continue to significantly impact the way we live and work, the critical importance of social issues and professional practice continues to increase; new computer-based products and venues pose ever more challenging problems each year.  It is our students who must enter the workforce and academia with intentional regard for the identification and resolution of these problems."

	list := word.NewDigester(description)
	list.SortByCounter("DESC")
	list.Describe()

	final := word.NewGrammarFilter(list).GetData()
	final.Describe()

	controller.ShowBlankPage()
}
Esempio n. 3
0
func (controller *Digester) work() {
	controller.GenerateXSRF()
	text := strings.TrimSpace(controller.GetString("digester-text"))
	if len(text) > maxLenText {
		text = text[0:50000]
	}
	list := word.NewDigester(text)
	list.SortByCounter("DESC")
	result := word.NewGrammarFilter(list).GetData()
	controller.Data["originalText"] = text
	controller.Data["processed"] = result
	controller.LoadTemplate("digester")
}
Esempio n. 4
0
func (analyser ProgramAnalyser) digest(text string) *word.Digester {
	digester := word.NewDigester(text)
	result := word.NewGrammarFilter(digester).GetData()
	result.RemoveOccurence(analyser.program.GetCode())
	return result
}
Esempio n. 5
0
func (analyser SubjectAnalyser) digest(text string) *word.Digester {
	digester := word.NewDigester(text)
	return word.NewGrammarFilter(digester).GetData()
}