Project logistics
- Mentor: Ata Turk. email: ataturk-at-bu-dot-edu
- Min-max team size: 3-5
- Expected project hours per week (per team member): 6-8
- Will the project be open source? Open for discussion.
Preferred past experience
Some experience in one or more of the following is preferred in possible team members:
- Indexing and query processing (building inverted indexes, etc.; preferably some experience with Lucene or Solr)
- Crawling (preferably Nutch)
- Map/Reduce (Hadoop)
- NoSQL systems (Cassandra, MongoDB)
What is buNews about?
- A Flipboard-like system
- Crawl news articles from well-respected sources
- Parse the relevant content of each article
- Categorize articles
- Put articles into users' timelines based on their news preferences
- Enable users to create their own news magazines (by picking articles and adding them to their magazines) and to subscribe to other users' magazines
- Build an inverted index of the articles
- Respond to user queries with 10 blue links of the most relevant news articles
- Use intelligent ranking mechanisms (not relevance alone; newer articles should rank higher)
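The indexing, querying, and ranking items above can be illustrated with a minimal Python sketch. It is not the project's design: the article fields (`title`, `body`, `published`), the TF-IDF style relevance score, and the `half_life_hours` recency decay are all assumptions chosen for illustration. A real system would more likely delegate indexing and scoring to Solr/Lucene.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(articles):
    """Build an inverted index: term -> {article_id: term frequency}."""
    index = defaultdict(dict)
    for art_id, art in articles.items():
        for term in tokenize(art["title"] + " " + art["body"]):
            index[term][art_id] = index[term].get(art_id, 0) + 1
    return index


def search(query, index, articles, now, half_life_hours=24.0, top_k=10):
    """Return up to top_k article ids, ranked by relevance times freshness.

    Relevance is a simple TF-IDF sum; freshness halves every
    half_life_hours (a hypothetical tuning parameter).
    """
    n_docs = len(articles)
    relevance = defaultdict(float)
    for term in set(tokenize(query)):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))
        for art_id, tf in postings.items():
            relevance[art_id] += tf * idf

    ranked = []
    for art_id, score in relevance.items():
        age = now - articles[art_id]["published"]
        age_hours = age.total_seconds() / 3600
        freshness = 0.5 ** (age_hours / half_life_hours)
        ranked.append((score * freshness, art_id))
    ranked.sort(reverse=True)
    return [art_id for _, art_id in ranked[:top_k]]
```

With `top_k=10` the function returns at most ten results, matching the "10 blue links" goal. Multiplying relevance by an exponential decay is only one way to favor newer articles; an additive boost or a learned ranking function would be equally valid choices.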
Some Technologies Expected To Be Learned/Used
- Map/Reduce, Hadoop
- NoSQL for managing users/magazines/articles/...
- Crawling (e.g., Nutch)
- Indexing and query processing (Solr)
- Categorization
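To give a feel for how Map/Reduce and categorization might fit together, here is a small in-memory Python sketch of the map, shuffle, and reduce phases that counts articles per category. It does not use Hadoop itself; it only mimics the programming model. The keyword lists in `CATEGORY_KEYWORDS` and the keyword-overlap categorizer are hypothetical placeholders, not a method chosen for the project.

```python
from collections import defaultdict

# Hypothetical keyword lists; a real categorizer would likely be trained on data.
CATEGORY_KEYWORDS = {
    "politics": {"election", "senate", "vote"},
    "sports": {"match", "league", "goal"},
}


def categorize(text):
    """Pick the category whose keywords overlap the text most, else 'other'."""
    words = set(text.lower().split())
    best, best_hits = "other", 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = category, hits
    return best


def map_phase(article):
    """Map: emit one (category, 1) pair per article."""
    yield categorize(article), 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: sum the counts for one category."""
    return key, sum(values)


def count_by_category(articles):
    """Run map, shuffle, and reduce over a list of article texts."""
    pairs = (pair for article in articles for pair in map_phase(article))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In Hadoop, the map and reduce functions would run in parallel across machines and the framework would handle the shuffle; the structure of the computation stays the same.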