2023 Must be similar to screenshots I must be able to run the projects on Eclipse | Assignment Collections

Computer Science 2023 Java Programming Advance Three Projects

  Must be similar to screenshots

I must be able to run the projects in Eclipse so that I can upload the code to my GitHub account

The projects must say that they were created by:

  • Juliet Mercado
  • Zachary Willis
  • Ihor Panchenko
  • Craig Anderson

Building a Search Engine, Part I: Governance, Workflow, and UI

(This is the first project in this series)

You are going to design, build, and test a scaled-down version of “Google Search”. Rather than searching the Internet’s files, you will only search local files added to your search engine’s index. Your search engine will allow an administrator to add, update, and remove files from the index. Users will be able to enter search terms, and select between Boolean AND, OR, or PHRASE search. The matching file names (if any) are then displayed in a list.

You also need to design the system architecture (the high-level design), so you can plan each part. 

Search Engine Project Proposal:

Build a search engine with a simple GUI that can do AND, OR, and PHRASE Boolean searches on a small set of text files. The user should be able to choose the type of search to do, and enter some search terms. The results should be a list of file pathnames that match the search. This should be a stand-alone application.

User Interfaces

In addition to the main user interface (for doing searching), you will need a separate administrator or maintenance interface to manage your application. It should be easy to add and remove files (from the set of indexed files), and to regenerate the index anytime. When starting, your application should check if any of the files have been changed or deleted since the application last saved the index. If so, the administrator should be able to have the index updated with the modified file(s).

Note that with HTML, Word, or other types of documents, you would need to extract a plain text version before indexing. That isn’t hard, but the search engine is complex enough already. For these projects, limit your search engine to only plain text files (including .txt, .html, and other text files).

The index must be stored on disk, so next time your application starts it can reload its data. The index, list of files, and other data, can be stored in one or more file(s) or in a database. The saved data should be read whenever your application starts. The saved data should be updated (or recreated) when you add, update, or remove documents from your set (of indexed documents), or perhaps just when your application exits. If you use files, the file formats are up to you; have a format that is fast and simple to load and store.

To keep things as simple as possible, in this project you can assume that only a small set of documents will be indexed, and thus the whole index can be kept in memory at once. (That’s probably not the case for Google’s data!) All you need to do is be able to read the index data from disk at startup into memory, and write it back either when updating the index, or when your application shuts down. Note, the names (pathnames) of the added files as well as their last modification time must be stored in addition to the index.

If you use an XML file, you can define an XML schema for it and have some tool such as Notepad++ validate your file format for you. XML may have other benefits, but it isn’t as simple as using plain text files. JSON might be the easiest format for storing and reading the index data. In any case, don’t forget to include the list of file pathnames and other data you decide is needed, along with the index itself.

Requirements:

In this project, we will follow the model-view-controller design pattern for the project organization. This allows one to develop each part mostly independently from the other parts.

Develop Stub User Interfaces:

In this part of the project, you must implement a non-functional (that means it looks good but doesn’t do a thing) graphical user interface for the application. (The “view”.) The main (default) user interface must support searching and displaying results. It should have various other features, such as an “About…” menu or button, a way to quit the application (if a stand-alone application; if your group creates a web application, there is no need to quit), and a way to get to the administrator/maintenance view.

The maintenance/administrator view must allow the user to perform various administration operations: view the list of indexed file names, add files to the index, remove files from the index, and update the index (when files have been modified since they were indexed).

The user interface should be complete, but none of the functionality needs to be implemented at this time. You should implement stub methods for the functionality not yet implemented, and invoke them from your event handlers. The stub methods can either return “canned” (fake but realistic) data, or throw an UnsupportedOperationException. The only button that needs to do anything is the one used to switch to the maintenance view.
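A minimal sketch of what such stubs might look like, assuming a hypothetical SearchEngineStub class (the class name, method names, and canned pathnames are illustrative and not part of the assignment). It uses Java's standard UnsupportedOperationException for the operations that are not implemented yet:

```java
import java.util.List;

// Hypothetical stub model for Part I; all names here are illustrative assumptions.
final class SearchEngineStub {

    // Returns canned (fake but realistic) results so the UI has something to display.
    static List<String> search(String terms, String mode) {
        return List.of("/home/user/docs/notes.txt", "/home/user/docs/todo.txt");
    }

    // Not implemented yet: fails loudly instead of silently doing nothing.
    static void addFile(String pathname) {
        throw new UnsupportedOperationException("addFile is not implemented yet");
    }
}
```

An event handler can call these methods directly; once the real model exists, only the method bodies need to change.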

Since the user interfaces don’t do anything, there is nothing to test yet. However, you must create a test class with at least one test method (it can just return success if you wish). I suggest you agree to use JUnit 4 style tests for now.

Building a Search Engine, Part II: Persistent Data

Please read the background information and full project description from Search Engine Project, Part I. In this project, you will implement the persistent data (the “model”) part of the project: the saving of data and the loading of data at the next start. The persistent data contains the list of files used in the index, and the index itself.

First discuss which persistence solution you will use: text files, XML or JSON files, or a database. If you use a database, choose between embedded (my suggestion) or server, and between the JDBC and JPA database APIs (I suggest JPA). You can make this decision before knowing the details of the data structures used.

Before working on actual code, you need to decide on the data structures to be used for the file list and the inverted index. Try to read the Java collections material before deciding. 

It should be easy to add and remove files (from the set of indexed files). When starting, your application should check if any of the files used have been changed or deleted since the application last saved the index. If so, the “admin” user should be able to have the inverted index file(s) updated, from the maintenance interface.

(Note that with HTML or Word documents, you would need to extract a plain text version before indexing.) In this project, all the “indexable” files are plain text. You are free to assume the system-default text file encoding, or assume UTF-8 encoding, for all files.

The inverted index can be stored in one or more file(s), and that should be read whenever your application starts. The file(s) should be updated (or recreated) when you add, update, or remove documents from your set (of indexed documents). The file format is up to you, but should have a format that is fast and simple to search. However, to keep things simpler, in this project you can assume that only a small set of documents will be indexed, and thus the whole index can be kept in memory. All you need to do is be able to read the index data from a file at startup into memory, and write it back when updating the index. Don’t forget the names (pathnames) of the files as well as their last modification time must be stored as well. It is your choice to use a single file or multiple files, in plain text, JSON, XML, or any format your group chooses, to hold the persistent data. If you want, you can use any DBMS. (In that case, I suggest using the JavaDB included with the JDK, as an embedded database.) In any case, your file format(s) or database schema must be documented completely, so that someone else, without access to your source code could use your file(s) or database correctly.

If using XML format, you can define an XML schema for your file and have some tool such as Notepad++ validate your file format for you. XML may have other benefits, but it isn’t as simple as plain text files or even JSON files. In any case, don’t forget to include the list of file (path) names, along with the index itself, in your persistent data store.
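As one possible plain-text layout, the sketch below saves and reloads only the list of indexed files, one line per file with the last modification time and the pathname separated by a tab. The FileListStore class and the line format are assumptions made for illustration (the assignment leaves the format up to you), and the sketch assumes pathnames contain no tab or newline characters. The index itself could be stored in a similar line-oriented way.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical store for the file list; the "lastModified<TAB>pathname" format is an assumption.
final class FileListStore {

    // Writes one line per indexed file (UTF-8), replacing any previous contents.
    static void save(Path store, Map<String, Long> files) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : files.entrySet()) {
            lines.add(e.getValue() + "\t" + e.getKey());
        }
        Files.write(store, lines);
    }

    // Reads the lines back into a map from pathname to last modification time.
    static Map<String, Long> load(Path store) throws IOException {
        Map<String, Long> files = new TreeMap<>();
        for (String line : Files.readAllLines(store)) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skip blank or malformed lines
            }
            files.put(line.substring(tab + 1), Long.parseLong(line.substring(0, tab)));
        }
        return files;
    }
}
```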

Part II Requirements:

In this part, you must implement the file operations of your search engine application (the model). That includes reading and updating your persistent data (that is, the inverted index as well as any other information you need to store between runs of your application, such as the list of files (their pathnames) that have been indexed). The main file operations are reading each file to be indexed a “word” at a time; you also need to check whether the previously indexed files still exist or have been modified since they were last indexed.

The maintenance part of the user interface should allow users to select files for indexing, and to keep track of which files have been added to the index. For each file, you need to keep the full pathname of the file as well as the file’s last modification time. Your code should correctly handle the user entering non-existent files and unreadable files. How you handle such errors is up to you.
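The check for missing, unreadable, or modified files could be sketched as follows. The FileStatusCheck class and its Status values are hypothetical names, and the sketch assumes the saved modification time is kept as milliseconds since the epoch, as returned by java.io.File.lastModified():

```java
import java.io.File;

// Hypothetical helper; the class name and Status values are illustrative assumptions.
final class FileStatusCheck {

    enum Status { UNCHANGED, MODIFIED, MISSING_OR_UNREADABLE }

    // Compares a file on disk with the modification time saved at indexing time.
    static Status check(String pathname, long savedLastModified) {
        File f = new File(pathname);
        if (!f.isFile() || !f.canRead()) {
            return Status.MISSING_OR_UNREADABLE;
        }
        return f.lastModified() == savedLastModified ? Status.UNCHANGED : Status.MODIFIED;
    }
}
```

Running this over the saved file list at startup tells the maintenance view which entries need to be re-indexed or removed.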

You can download a Search Engine model solution, to play with it and inspect its user interface. My solution keeps all persistent data in a single text file in the user’s home directory, but you can certainly use a different persistence solution.

Possible Data Structures you can use. In part III, you will implement the index operations, including Boolean searching, adding to the index, and removing files from the index. (The index is a complex collection of collections.) Because the format of the index and file list will affect the code used to read and write them to and from storage, you must decide on the in-memory data structures to be used early. In the model solution, I used a List of FileItem objects for the list of indexed files; each FileItem contained a file’s pathname and date it was read for the index. The index data itself is stored in a Map, using the indexed words as keys and a Set of IndexData objects as the values. Each IndexData object holds the id of the file containing the word and the position of the word in that document. (The classes FileItem and IndexData were trivial to write.)

This is NOT the only, or the best, way to represent the index or file list! (For example, a List of int[2] arrays might be simpler than a Set of IndexData objects.) You should decide on the types of collections used. Only then can you implement the methods to read and write the data.
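For illustration, the layout described for the model solution might look roughly like the sketch below. Only the names FileItem and IndexData come from the description; the field names, the InvertedIndex wrapper, the use of a file's list position as its id, and the use of records (Java 16 or later) are assumptions:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// One indexed file: its pathname and a timestamp in milliseconds since the epoch.
record FileItem(String pathname, long timestamp) {}

// One occurrence of a word: the id of the file and the word's position in it.
record IndexData(int fileId, int position) {}

// Hypothetical wrapper holding both collections.
final class InvertedIndex {
    // Assumption: a file's id is its position in this list.
    final List<FileItem> files = new ArrayList<>();
    final Map<String, Set<IndexData>> index = new HashMap<>();

    // Records that the word occurs in the given file at the given position.
    void addOccurrence(String word, int fileId, int position) {
        index.computeIfAbsent(word, k -> new HashSet<>()).add(new IndexData(fileId, position));
    }
}
```

Records supply equals and hashCode automatically, which is what lets IndexData objects behave correctly inside a Set.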

Building a Search Engine, Part III: Collections

Please read the background information and full project description from Search Engine Project, Part I.

In this final part of the project, you will complete the application by implementing the index functions. These include adding a file to the index, removing a file from the index, and reading and writing the index from/to a file. (Updating the index when a file has been changed can then be done by removing and then re-adding the file.) Other operations include searching the index for a given word, and returning a Set of pairs (document ID and position) for that word.
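A rough sketch of the add, remove, and lookup operations, assuming a Map from words to Sets of (document ID, position) pairs. The IndexOps class, the Pair record (Java 16 or later), and splitting on whitespace with positions starting at 0 are all assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// A (document ID, position) pair; hypothetical name and fields.
record Pair(int fileId, int position) {}

final class IndexOps {
    final Map<String, Set<Pair>> index = new HashMap<>();

    // Adds every whitespace-separated word of the text under the given file id.
    void addFile(int fileId, String text) {
        String[] words = text.trim().split("\\s+");
        for (int pos = 0; pos < words.length; pos++) {
            if (words[pos].isEmpty()) {
                continue; // only happens when the text is blank
            }
            index.computeIfAbsent(words[pos], k -> new HashSet<>()).add(new Pair(fileId, pos));
        }
    }

    // Removes all pairs of the given file, then drops words left with no pairs.
    void removeFile(int fileId) {
        for (Set<Pair> pairs : index.values()) {
            pairs.removeIf(p -> p.fileId() == fileId);
        }
        index.values().removeIf(Set::isEmpty);
    }

    // Returns the pairs for a word, or an empty set if the word is not indexed.
    Set<Pair> lookup(String word) {
        return index.getOrDefault(word, Set.of());
    }
}
```

Updating a changed file is then removeFile followed by addFile with the new contents.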

Finally, you will have to implement the Boolean search functions of the main user interface. (This is complex enough, that it should have been another project!) I suggest you start with an “OR” search, then worry about implementing the “AND” and “PHRASE” search functions.

When building the index, keep in mind you will need to define what you mean by “word”. One possibility is to strip out any characters that are not letters or digits, and convert the result to all lowercase, both when you build the inverted index and when you read the search terms entered by the user. Ideally, you can use the I18N methods discussed in class to normalize the words.
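One way to apply that definition of “word” is sketched below. The WordNormalizer class is a hypothetical name, and lowercasing with Locale.ROOT is an assumption; the I18N methods discussed in class may be a better fit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; use the same normalization for indexing and for search terms.
final class WordNormalizer {

    // Keeps only letters and digits, then converts to lowercase.
    static String normalize(String token) {
        StringBuilder sb = new StringBuilder();
        token.codePoints()
             .filter(Character::isLetterOrDigit)
             .forEach(sb::appendCodePoint);
        return sb.toString().toLowerCase(Locale.ROOT);
    }

    // Splits text on whitespace and returns the non-empty normalized words in order.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        for (String token : text.split("\\s+")) {
            String w = normalize(token);
            if (!w.isEmpty()) {
                result.add(w);
            }
        }
        return result;
    }
}
```

With this sketch, input consisting only of punctuation and spaces yields an empty word list.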

Implementing Boolean Search:

The exact method depends in part on how you implement the inverted index. In the suggested implementation (a Map with words as the keys, and a List or Set of (document ID, position) pairs as the values), you could implement the Boolean searches using algorithms similar to the following (you can come up with your own if you wish):

OR Search

This is the easiest one to implement. The general idea is to start with an empty Set of matching files. Then add to that Set the files containing each search term: just search the Map for that word, and add each document found (if any). The result is the OR search results, the files that contain any word in the search list. (If the user inputs no search words, say “ ,.”, then no files are considered as matching.)
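The OR search described above can be sketched as follows, assuming the index is a Map from words to Sets of (document ID, position) pairs and that the search terms are already normalized. The OrSearch class and the Pair record (Java 16 or later) are hypothetical names:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// A (document ID, position) pair; hypothetical name and fields.
record Pair(int fileId, int position) {}

final class OrSearch {

    // Returns the ids of all files that contain at least one of the terms.
    static Set<Integer> search(Map<String, Set<Pair>> index, List<String> terms) {
        Set<Integer> matches = new TreeSet<>(); // start with no matching files
        for (String term : terms) {
            for (Pair p : index.getOrDefault(term, Set.of())) {
                matches.add(p.fileId());
            }
        }
        return matches;
    }
}
```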

AND Search

This is done the opposite way from an OR search, and is only a little harder to implement. The idea is to start with a set of all files in the index. Then for each search term, for each file in the Set, make sure that file is contained in the index for that search term. Remove any files from the set that don’t contain that word. The resulting final set is the documents matching all search terms. (If the user inputs no search words, say “ ,.”, then all files are considered as matching. If that isn’t the behavior you want, you need to treat that as a special case.)
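A sketch of the AND search under the same assumptions (a Map from words to Sets of (document ID, position) pairs, already-normalized terms). The AndSearch class, the Pair record (Java 16 or later), and passing in the set of all file ids as a parameter are illustrative choices:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// A (document ID, position) pair; hypothetical name and fields.
record Pair(int fileId, int position) {}

final class AndSearch {

    // Returns the ids of the files that contain every one of the terms.
    static Set<Integer> search(Map<String, Set<Pair>> index,
                               Set<Integer> allFileIds,
                               List<String> terms) {
        Set<Integer> matches = new TreeSet<>(allFileIds); // start with all files
        for (String term : terms) {
            Set<Integer> filesWithTerm = new HashSet<>();
            for (Pair p : index.getOrDefault(term, Set.of())) {
                filesWithTerm.add(p.fileId());
            }
            matches.retainAll(filesWithTerm); // drop files that lack this term
        }
        return matches;
    }
}
```

As noted above, an empty term list returns every file in this sketch; add a special case if that is not the behavior you want.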

PHRASE Search

This is the hardest search to implement. Unlike the OR and the AND searches, with PHRASE searching, the position of the search terms in the files matters. The algorithm I came up with is:

Create an initially empty Set of Pair objects.

Add to the set the Pair objects for the files that contain the first word of the phrase. This is the easy part: Just look up that word in the Map, and add all Pair objects found to a set.

The Set now contains Pair objects for just the files that might contain the phrase. Next, loop over the remaining words of the phrase, removing any Pairs from the set that are no longer possible phrase continuations. (Actually, I just build a new Set.)

For each remaining word in the phrase:

Create a new, empty set of Pairs.

For each Pair in the previous set, see if the word appears in the same file, but in the next position. If so, add the Pair object for the word to the new set.

An example may help clarify this. Suppose the search phrase is “big top now”. The set initially contains all the Pair objects for the word “big”. Let’s say for example, that set looks like:

(file1,position7), (file1,position22), (file3,position4)

For each Pair object in that set, you need to see if “top” is in that same file, but the next position. If so, you add the Pair object for that to the new Set. The (inner) loop for this example checks each of the following:

Is a (file1,position8) Pair object in the Map for the word “top”?

Is a (file1,position23) Pair object in the Map for the word “top”?

Is a (file3,position5) Pair object in the Map for the word “top”?

If the answer is “yes”, then add that Pair object to the new set. When this loop ends, the new set will contain the Pair objects for the phrase “big top” (pointing to the position of the word “top”).

For example, suppose “top” is only found in (file1,position8) and (file3,position5). You replace the first set with this new set:

(file1,position8), (file3,position5)

Repeat for the next word in the phrase, using the set built in the previous loop.

Continue until the set is empty (so phrase not found), or until the last word of the phrase has been processed. The Pair objects remaining in the final set are the ones that contain the phrase; the position will be that of the last word of the phrase. (We only need to display the file name; in this project, the position of the phrase doesn’t matter.)
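The PHRASE algorithm above might be sketched like this, again assuming a Map from words to Sets of (document ID, position) pairs and already-normalized phrase words. The PhraseSearch class and the Pair record (Java 16 or later) are hypothetical names, and returning an empty result for an empty phrase is an assumption:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// A (document ID, position) pair; hypothetical name and fields.
record Pair(int fileId, int position) {}

final class PhraseSearch {

    // Returns the ids of the files that contain the words consecutively, in order.
    static Set<Integer> search(Map<String, Set<Pair>> index, List<String> phrase) {
        Set<Integer> files = new TreeSet<>();
        if (phrase.isEmpty()) {
            return files; // assumption: an empty phrase matches nothing
        }
        // Start with every occurrence of the first word.
        Set<Pair> current = new HashSet<>(index.getOrDefault(phrase.get(0), Set.of()));
        for (int i = 1; i < phrase.size() && !current.isEmpty(); i++) {
            Set<Pair> occurrences = index.getOrDefault(phrase.get(i), Set.of());
            Set<Pair> next = new HashSet<>();
            for (Pair p : current) {
                // The next word must be in the same file at the following position.
                Pair following = new Pair(p.fileId(), p.position() + 1);
                if (occurrences.contains(following)) {
                    next.add(following);
                }
            }
            current = next;
        }
        for (Pair p : current) {
            files.add(p.fileId());
        }
        return files;
    }
}
```

With the example data above (“big” at (file1,7), (file1,22), (file3,4) and “top” at (file1,8), (file3,5)), a search for “big top” keeps file1 and file3.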

Part III Requirements:

This project has been split into three parts. Each part counts as a separate project. In the first two parts, you designed and implemented a graphical user interface for the application, and added all required file operations.

In this part, you must implement the remaining operations of your search engine application: the index operations, and the searching.

You can download a Search Engine model solution, to play with it and inspect its user interface, but please keep in mind you should not copy that user interface; instead, invent a better, nicer-looking one.

Hints:

Keep your code as simple as possible.

The inverted index is naturally a Map, from words (the keys) to a Set of objects (the values). Each of the objects represents a document and a location within that document, where the word was found. I called these objects Pairs, since they are a pair of numbers, but you can use any name for your classes. Note, you will need to be able to go from a document number to a file name, when you display the search results.

 
