Various positions to work on research-paper recommender systems (Mr. DLib) and Docear (Bachelor/Master/PhD/Post-Doc)

Here at Docear and Mr. DLib we have many exciting projects in the field of recommender systems, user modelling, personalisation, and adaptive systems (primarily with a focus on digital libraries, but we are also open to domains such as health care, transportation, and tourism). If you are interested in pursuing any of the projects as part of a Bachelor, Master, or PhD thesis, as a post-doctoral researcher, or as a short-term research internship, read on.

The projects

We have a number of interesting projects, but please consider the following list only as a suggestion. If you have your own ideas, you are very welcome to discuss them with us. In general, we are interested in anything relating to user modelling, personalization, and recommender systems in the domains of digital libraries, tourism, health care, and transportation. However, our recommender systems and reference management tools are real-world applications; hence, we are also interested in topics such as software design and architecture, scalability, parallelization, and databases.

Whatever project you choose, we can guarantee that you will be working on a real-world software system that is used by tens of thousands of researchers around the globe. You will gain valuable insights into research and software development that will help you pursue a career in both industry and academia.

Mr. DLib

Mr. DLib is our new project that offers literature recommendations as a service. The service is currently used by our pilot partner GESIS and will soon be integrated into JabRef (one of the most popular reference managers), with further partners to follow. Mr. DLib already delivers more than 10,000 recommendations per day, and we expect to deliver even more once JabRef and further partners are using Mr. DLib. Delivering so many recommendations to a variety of partners is the ideal setting for conducting research about recommender systems. A lot of current research in the domain of (literature) recommender systems suffers from artificial lab scenarios and little relevance for real-world applications. When you conduct your research about recommender systems with Mr. DLib, you can be sure that your results will have high practical relevance and help to shape the way academics receive recommendations in the real world.

A Novel Recommendation Approach for Scientific Recommender Systems

There are more than 90 approaches to giving research-paper recommendations. Your goal will be to develop a novel research-paper recommendation approach that is more effective than those currently available. To achieve this goal, you will integrate some of the existing approaches into Mr. DLib’s “Recommender System as a Service”, and either develop a completely new approach or enhance the existing ones. We have many ideas on how existing approaches could be enhanced. For instance, the ranking process could be improved by using scientometrics (e.g. citation counts of papers, h-index, …) or machine learning, and there are many more options that we are happy to share with you in a personal discussion.
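To make the scientometrics idea a little more concrete, here is a minimal sketch of re-ranking candidates by blending a text-relevance score with a citation count. It is only an illustration under stated assumptions, not how Mr. DLib ranks: the `Candidate` class, the `rerank` function, the `weight` parameter, and the toy numbers are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    relevance: float  # assumed: similarity score in [0, 1] from any base approach
    citations: int    # assumed: citation count of the paper


def rerank(candidates, weight=0.3):
    """Blend text relevance with a log-scaled, normalised citation count."""
    # Log scaling keeps a few highly cited papers from dominating the ranking.
    max_log = max((math.log1p(c.citations) for c in candidates), default=0.0)

    def score(c):
        popularity = math.log1p(c.citations) / max_log if max_log > 0 else 0.0
        return (1 - weight) * c.relevance + weight * popularity

    return sorted(candidates, key=score, reverse=True)


# Invented toy data, for illustration only.
candidates = [
    Candidate("Paper A", relevance=0.80, citations=2),
    Candidate("Paper B", relevance=0.75, citations=500),
    Candidate("Paper C", relevance=0.40, citations=10),
]
ranked = [c.title for c in rerank(candidates)]
```

With `weight=0.0` the original relevance order is kept; larger weights favour highly cited papers. Finding a suitable weight (or learning it) would be part of the research.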

Reproducibility in Recommender-System Research

The reproducibility of experimental results is the “fundamental assumption” in science, and the “cornerstone” for drawing meaningful conclusions. Recently, we found that reproducibility is rarely achieved in the recommender-systems community, particularly in the research-paper recommender-system community. In a review of 90 research-paper recommender systems, we identified several cases in which only slight variations in the initial setup of the evaluation or approaches led to surprisingly different results.

We want to find out how recommender systems should ideally be implemented and evaluated to ensure reproducible results. To achieve this goal,

  • you will implement a number of research-paper recommendation approaches (or simply use some existing ones),
  • these approaches will then be used to give recommendations to the users of the applications that use Mr. DLib’s recommender system (GESIS, JabRef, Docear, and maybe others),
  • you will analyze how the recommendation approaches perform in the different scenarios and try to identify the factors that affect the recommendation effectiveness. This includes making controlled changes to the algorithms and to the applications that display the recommendations.
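The analysis step above could start with something as simple as the following sketch, which compares how each approach performs in each application. It assumes click-through rate (CTR) as the effectiveness measure and a log of `(approach, application, clicked)` records; the record layout, the approach names, and the toy data are hypothetical and not taken from Mr. DLib.

```python
from collections import defaultdict


def click_through_rates(log):
    """Return {(approach, application): CTR} from (approach, application, clicked) records."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for approach, application, was_clicked in log:
        key = (approach, application)
        shown[key] += 1                   # every record is one delivered recommendation
        clicked[key] += int(was_clicked)  # count only the clicked ones
    return {key: clicked[key] / shown[key] for key in shown}


# Invented toy log, for illustration only.
log = [
    ("content_based", "JabRef", True),
    ("content_based", "JabRef", False),
    ("content_based", "GESIS", False),
    ("content_based", "GESIS", False),
    ("citation_based", "JabRef", False),
    ("citation_based", "JabRef", False),
    ("citation_based", "GESIS", True),
    ("citation_based", "GESIS", False),
]
ctr = click_through_rates(log)
```

In this toy log the two approaches swap ranks between the two applications, which is exactly the kind of scenario-dependent result the project is meant to investigate.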

If you are interested in this project, please have a look at our paper that was published in the journal “User Modeling and User-Adapted Interaction (UMUAI)”. The paper gives you a detailed overview of the topic of reproducibility. Your task would be to continue what we outlined in that paper.

Docear

Docear is a reference manager, comparable to tools like JabRef, EndNote, Zotero, or Mendeley. In the past year(s) the development of Docear paused somewhat, but we would love to have someone who brings new life to the project. There are many things to do, like updating the code base to the latest Freeplane and JabRef versions, developing a web-based version, or creating a new workspace concept. Of course, there is also lots of research to do. For instance, you could analyze how researchers are using mind maps to manage literature, or work on modelling the user’s interests.

Requirements

For any of the projects you need the following skills:

  • JAVA
  • Linux
  • Databases (MySQL)
  • English proficiency (oral and written)
  • Academic Writing

Skills in the following areas are a plus:

  • Recommender Systems
  • Machine Learning
  • Tomcat
  • Apache Lucene/Solr
  • Apache Mahout
  • JAVA Jersey
  • NoSQL databases (especially graph databases such as Neo4j)
  • Web Services
  • XML
  • JSON
  • JIRA and Confluence
  • Git/GitHub
  • Social Network Analysis
  • Distributed File Storage
  • PDF processing
  • Web Crawling
  • Web Technologies
  • Other programming languages including C/C++/C#, Python, or JavaScript
  • WordPress

The Locations & Supervisors

Trinity College Dublin (TCD), Ireland / Prof. Joeran Beel

Trinity College Dublin is Ireland’s most prestigious university and among the top universities in Europe. It is located directly in the city center with a beautiful campus. Trinity College retains a tranquil collegiate atmosphere despite its location in the centre of a capital city (and despite it being one of the most significant tourist attractions in Dublin). This is in large part due to the compact design of the college, whose main buildings look inwards and are arranged in large quadrangles (called squares), and the existence of only a few public entrances [1].

Your supervisor at Trinity College Dublin will be Joeran Beel, founder of Docear and Mr. DLib. Joeran is Assistant Professor of Intelligent Systems in the School of Computer Science and Statistics (from December 1st, 2016). He is also affiliated with the ADAPT Centre, which is a research centre closely cooperating with industry partners such as eBay, Intel, Microsoft, and PayPal. Joeran’s research focuses on recommender systems: recommendations as a service, recommender-system evaluation, and user modelling in the domain of digital libraries. Joeran has published three books and over 40 peer-reviewed articles, and has been awarded various grants for research projects, patent applications, and prototype development as well as some business start-up funding. He is involved in the development of several open-source projects such as Docear, Mr. DLib, JabRef, and Freeplane. Joeran founded two successful IT start-ups and received multiple awards and prizes for each. He studied and researched in Australia (Sydney), the US (Berkeley), Germany (Magdeburg & Konstanz), Cyprus (Nicosia) and England (Lancaster). He has an M.Sc. in Project Management, an M.Sc. in Business Information Systems and a Ph.D. in Computer Science. Prior to TCD, Joeran worked as an IT product manager in the tourism industry (Munich, Germany), and as a post-doctoral researcher at the National Institute of Informatics in Tokyo, Japan.

Image: Trinity College Dublin, Parliament Square

Source: https://upload.wikimedia.org/wikipedia/commons/6/65/Trinity_College_Dublin%2C_Parliament_Square.jpg

Dublin is the capital city of Ireland. Its vibrancy, nightlife and tourist attractions are renowned and it is the most popular entry point for international visitors to Ireland. It’s disproportionately large for the size of Ireland with nearly two million in the Greater Dublin Region – well over a third of the Republic’s population! The centre is, however, relatively small and can be navigated by foot, with most of the population living in suburbs. Being subject to the moderating effects of the Atlantic Ocean and the Gulf Stream, Dublin is known for its mild climate. Contrary to some popular perception, the city is not especially rainy. [2]

Konstanz University, Germany / Prof. Bela Gipp

The University of Konstanz (German: Universität Konstanz) is a university in the city of Konstanz in Baden-Württemberg, Germany. It was founded in 1966, and the main campus on the Gießberg was opened in 1972. The University is situated on the shore of Lake Constance just four kilometers from the Swiss border. As one of eleven German Excellence Universities, University of Konstanz is consistently ranked among the global top 250 by the Times Higher Education World University Rankings. In 2016 it was ranked 7th globally by the Times Higher Education 150 under 50 rankings. It is often referred to as “small Harvard” by the German media. [3]

Your supervisor at Konstanz University would be Bela Gipp. Bela Gipp serves as Juniorprofessor in Information Science at the University of Konstanz. Prior to this, he was a post-doctoral researcher at the University of California Berkeley and at the National Institute of Informatics in Tokyo. From 2009 to 2013, at the invitation of the University of California Berkeley, he conducted his doctoral research at the School of Information and the Department of Statistics, while being an external doctoral student at the University of Magdeburg, Germany. He received his Ph.D. in Computer Science with distinction. His research interests are information retrieval and visualization, knowledge management systems and web technologies. His recent work focuses on developing semantic methods that analyze links, citations and other language-independent characteristics to improve recommender systems and plagiarism detection software. He also researches the use of mind maps for different knowledge management tasks. He has published over 40 articles and three books, and filed four patents. As a founder of SciPlore he develops the open-source software Docear. Starting at the age of 16, he won several prizes at Jugend Forscht, Germany’s premier youth science competition. The German Chancellor honored him after he won 1st place in the state-level round three times in a row. Today, he acts as a jury member for Jugend Forscht. Scholarships allowed him to study abroad in Sydney, Beijing, and at UC Berkeley. After obtaining his Master of Science in Information Systems and a Master of Business Administration (MBA), he worked for the VLBA-lab, a joint venture of SAP AG and the University of Magdeburg. He has taught at the universities in Berkeley and Magdeburg and at the University of Applied Sciences HTW Berlin. He volunteered in rural China and India, teaching math and computer science, as well as at Syria’s Wadi International University. In his spare time he enjoys photography.

National Institute of Informatics (NII), Tokyo, Japan / Prof. Akiko Aizawa

The National Institute of Informatics (国立情報学研究所 Kokuritsu Jōhōgaku Kenkyūjo, NII) is a Japanese research institute created in April 2000 for the purpose of advancing the study of informatics. This institute is also devoted to creating a system to facilitate the spread of scientific information to the general public. The NII is the only comprehensive research institute in Japan in informatics. It also oversees and maintains a large, searchable information database on a variety of scientific and non-scientific topics called Webcat. [4]

Your supervisor at NII would be Prof. Akiko Aizawa (and Joeran Beel would be the co-supervisor). Akiko Aizawa is a member of the Digital Content and Media Sciences Research Division at NII and works in the field of recommender systems, citation and content analysis, as well as math retrieval. Akiko Aizawa has successfully supervised many students from the University of Konstanz and was also the supervisor of Joeran Beel and Bela Gipp during their time at NII.

Funding Options

Overview

In the following we list various options for funding your research. However, please note that if you are aware of some other funding options (maybe some research programs by your university or national research council), you are very welcome to discuss these options with us. And, of course, you can always apply for a non-paid internship or contact us for a research cooperation.

Figure: Funding Overview

PhD Scholarship at Trinity College Dublin, Ireland (All Nationalities)

Joeran Beel will start working as Assistant Professor at Trinity College Dublin (Ireland) from December 1st, 2016. Part of the professorship is a budget for a PhD scholarship. The yearly compensation for a PhD student is 16,000€ plus tuition fees and some extra benefits (e.g. laptop). If you are interested in this position, please apply.

Begin: December 1st, 2016 or later
Deadline for Application: Open until a suitable candidate is found

Research Internship at Trinity College Dublin, Ireland (All Nationalities)

As long as the PhD scholarship has not been granted (see above), Joeran Beel may invite a visiting researcher for 3-12 months (Master or PhD students or Postdoc). The monthly compensation for an internship is 1,000€ for Master students and 1,333€ for PhD students and Postdocs. If you are interested in this position, please apply.

Begin: December 1st, 2016 or later
Deadline for Application: Open until a suitable candidate is found

Postdoctoral Marie Skłodowska-Curie Fellowship at Trinity College Dublin, Ireland (All Nationalities)

The ADAPT Centre is a research centre that combines the world-class expertise of researchers at four universities (Trinity College Dublin, Dublin City University, University College Dublin and Dublin Institute of Technology) with that of its industry partners to produce ground-breaking digital content innovations. Recently, the ADAPT Centre, in cooperation with further research centres and universities, has secured funding for 71 Marie Skłodowska-Curie Postdoctoral Fellowships in Ireland as part of the EDGE COFUND action. These fellowships allow postdoctoral researchers to pursue their own research project over a period of two years at one of the participating research centres, under the guidance of a supervisor and an industry partner.

If you are an experienced postdoctoral researcher who would like to pursue a research project independently in Ireland, in cooperation with Joeran Beel as supervisor and an industry partner, have a look at http://edge-research.eu for more information.

Begin: April 2017
Deadline to discuss a project with Joeran Beel as supervisor: November 1, 2016
Deadline for application: December 1st, 2016

PhD position at Konstanz University, Germany (All Nationalities)

Bela Gipp’s group has several open PhD positions for 3 years with an option for extension. The salary will be based on Germany’s TVL13 or TVL14 salary scheme for the public sector. There are also some PhD scholarships available with a monthly compensation of roughly 1150 to 1600 Euros. If you are interested in this position, please apply.

Post-doctoral position at Konstanz University, Germany (All Nationalities)

Some of the PhD positions at Konstanz University (see above) might also be given to postdoctoral researchers, also on a TVL13 or TVL14 salary scheme for the public sector. The position would be for 3 years with an option for extension. If you are interested in this position, please apply.

DAAD R.I.S.E Weltweit (German Bachelor Students to Tokyo or Dublin)

The DAAD R.I.S.E Weltweit program allows German Bachelor students to conduct a research project abroad. Between November 1 and December 22 you can browse and apply for a project you would like to pursue. We have submitted such a project to DAAD and will provide further information here on November 1st, 2016.

DAAD FITweltweit (German Master and PhD Students to Tokyo and Dublin)

DAAD FITweltweit for German Master and PhD students allows you to pursue a research project at the NII in Tokyo or at Trinity College Dublin. Applications are possible at any time. If you are interested in this position, please apply.

DAAD FITweltweit PostDoc (German Post-Doctoral Researchers to Tokyo)

The DAAD FITweltweit program for postdocs is certainly one of the most attractive fellowships available. It allows you to pursue a research project of your choice at the NII in Tokyo with a highly competitive remuneration (at least 3700€ per month, tax free, and if you are married or have children you get some extra remuneration). The fellowship also includes an additional budget that allows you to pay a Master or PhD student to work for you. Both Prof. Bela Gipp and Prof. Joeran Beel successfully applied for this scholarship, and we can support you with your application. Applications are possible at any time. If you are interested in this position, please apply.

DAAD P.R.I.M.E (German Post-Doctoral Researchers to Dublin)

DAAD P.R.I.M.E is a competitive fellowship that allows you to conduct a 12-month project at Trinity College Dublin and then stay an additional 6 months at a German university. During the whole time you will be employed by the German host university and be paid according to the TVL13 salary scheme. The German host university could be Konstanz University or any other university you like. The program usually has a call for applications once per year. If you are interested in this position, please apply.

DAAD Postdoc Programm (German Post-Doctoral Researchers to Dublin)

The DAAD Postdoc Programm (Kurzstipendium) allows German post-doctoral researchers to conduct a research project at Trinity College Dublin for 3 to 6 months. There is also a similar program for 7 to 24 months, but it is currently being restructured. The compensation is around 2500€ per month. If you are interested in this position, please apply.

Student Jobs at Konstanz University, Germany (Local Students Only)

At Konstanz University, we usually have some open positions for local students who are interested in working 20 to 80 hours per month. If you are interested in this position, please apply.

NII Internship Call (Students of TCD and Konstanz University to Tokyo)

The NII in Tokyo offers 3-6 month internships for Master and PhD students. Usually, there are two calls per year and only students from selected universities may apply (Trinity College Dublin and Konstanz University belong to these selected universities). The monthly compensation is 171,000 Japanese Yen, and you will be supervised by either Joeran Beel or Bela Gipp and a local supervisor at NII (probably Prof. Akiko Aizawa). If you are interested in this position, please apply.

Deadline: November 7 (you need to contact us well in advance to discuss the research project).

NII Short-Term Internship (All Nationalities)

Joeran Beel will be at NII until November 28, and until then we can invite one Master or PhD student to work with us at NII for 4-8 weeks. The remuneration is 171,000 Yen per month. If you are interested in this position, please apply as soon as possible.

Application & Contact

If you are interested in any of the positions, then:

  1. Find out which funding options are suitable for you
  2. Send the following information to joeran dot beel at adaptcentre dot ie
    • The standard information such as your current academic status, which position you are applying for, when you want to start, …
    • Brief letter of motivation (explain in which project you want to participate and why)
    • CV
    • A list that specifies for each of the above listed requirements, how good your knowledge is (e.g. JAVA: Excellent; Linux: Basic; Databases: Good; …)
    • If you have: List of publications and software projects you contributed to; link to Google Scholar profile and GitHub page or similar.
  3. We will get back to you with further information as soon as possible.

If you have any questions, please send an email to joeran dot beel at adaptcentre dot ie.

 

 

[1] Paragraph copied from https://en.wikipedia.org/wiki/Trinity_College,_Dublin

[2] Paragraph copied from http://wikitravel.org/en/Dublin

[3] Paragraph copied from https://en.wikipedia.org/wiki/University_of_Konstanz

[4] Paragraph copied from https://en.wikipedia.org/wiki/National_Institute_of_Informatics

Students & PostDocs: We have open positions in Tokyo, Copenhagen, and Konstanz (2-24 months)


Update 2016-01-12: The salary in Tokyo would be around 1,600 US$ per month, not 1,400.


2015 has been a rather quiet year for Docear, but 2016 will be different. We have lots of ideas for new projects, and even better, we have funding to pay at least 1 Master or PhD student to help us implement the ideas. There is also a good chance that we will get more funding, maybe also for Bachelor students and postdoctoral researchers. The positions will be located in Tokyo, Copenhagen or Konstanz (Germany).

In the following, there is a list of potential projects. If you are interested, please apply, and if you have your own ideas, do not hesitate to discuss them with us. What exactly you will be doing within each of the projects depends on your preferences, skills, how much time you have, and how many other students will participate in the project. Either way, each project will be highly suitable for both applying and enhancing your software development skills and conducting research, e.g. as part of a Master or PhD thesis.

With respect to software development, you will be working with JAVA and state-of-the-art recommendation and search frameworks (Apache Mahout, Lucene, Solr, and/or LensKit), data formats (XML, JSON and/or BibTeX), SQL and NoSQL databases (PostgreSQL, MySQL, and/or Neo4j), PDF processing tools (jPod), and distributed file storage and data processing (Hadoop or Spark), and you will get a deep dive into RESTful web services (Java Jersey) and recommendation technologies (e.g. content-based filtering and collaborative filtering). Best of all, the results of your work will help tens of thousands of researchers around the globe to do better research!

With respect to research, there are options for all levels of researchers (Bachelor, Master, PhD, Postdoc), in various disciplines: recommender systems, recommender system evaluation, digital libraries, web crawling, citation and network analysis, web services, databases, scalability, bibliometrics, security and privacy, interoperability, software quality, statistics, … So, if you want to do a Bachelor, Master, or PhD thesis, or just gain some research experience, there will be plenty of opportunities.

The Projects

Mr. DLib, the Machine-Readable Digital Library

Background

Mr. DLib is a digital library with around 2 million academic articles, crawled from the Web, and all articles are accessible through a RESTful API. Mr. DLib does not aim to be used by “end-users”. Instead, Mr. DLib offers services for operators of other academic services and software tools such as reference managers and digital libraries. Through Mr. DLib’s API, these academic services and tools can search for academic articles, request specific articles, or send PDF files to the API and receive metadata for the PDF files. Mr. DLib will be the foundation for many of our future projects such as the recommendations as a service (see below) and PDF metadata retrieval for Docear and other reference managers.

Software Development Goals

In the next year, we want to significantly extend the functionality of Mr. DLib and grow the document corpus. This includes the enhancement of the Web Crawler and Google Scholar Parser, PDF processing (jPod), citation and PDF metadata parser (ParsCit), data storage (MySQL and Neo4j), search functionality (Apache Lucene & Solr), data delivery (Java Jersey), rights management, and performance (Hadoop or Spark?). We also plan to re-design large parts of the architecture (currently based on Apache Lucene; Java Jersey; Hibernate; MySQL) and data model (XML; JSON; MySQL). Hence, plenty of work for a capable software developer :-).

Research Opportunities

Mr. DLib offers various research opportunities. You could research how to extract titles and citations most effectively from PDF files; how to build a scalable Web Service and/or Web Crawler; how to measure code quality; how to request large amounts of data from Google Scholar without being detected as a robot; you could compare the performance of Hadoop and Spark; or do other research in the field of digital libraries and open access …

Required Skills

Must

  • JAVA
  • Databases, preferably MySQL or PostgreSQL and/or Neo4j

Desired

  • Server administration (Linux & Tomcat)

Optional

  • Any of the fields, technologies and data formats listed in the project description (Web Crawling, Apache Lucene, Jersey, Hibernate, Hadoop, XML, RESTful Web Services, …)

Research-Paper Recommendations as a Service

Background

There are countless academic services and software tools such as reference managers, academic search engines, (digital) libraries and (electronic) journals. However, only a few of them offer research-paper recommender systems to their users, although recommender systems could provide lots of additional value. Imagine your reference manager regularly provided you with a list of newly published papers that are relevant to your work; or your professor recommends a book to you, the book is no longer available in your university’s library, but the library’s website recommends alternative books to you; …

One reason why so many academic services and tools do not have research-paper recommender systems is that developing such systems requires a lot of knowledge and effort, and the libraries do not have the knowledge or resources.

Software Development Goals

Our goal is to develop a research-paper recommender system “as a service” that can be used by any academic service or software tool, without requiring much knowledge about recommender systems or a lot of resources. This recommender system will be built on top of Mr. DLib, and it will allow third-party tools to easily get “recommendations as a service”. For instance, a reference manager could send a user’s personal library to Mr. DLib, and Mr. DLib would return a list of research-paper recommendations. Similarly, a digital library could send the metadata (title, ISBN, …) of a certain article or book to Mr. DLib and receive a list of related articles and books. The entire communication between Mr. DLib and the client applications will be based on a RESTful Web Service and standard data formats such as XML and JSON. The first pilot partners to use Mr. DLib’s recommender system are Docear and JabRef (if you are the operator of an academic service and want to be a pilot partner, please contact us). The task of developing a “Recommender System as a Service” is truly challenging and multifaceted, since the users’ data from various sources (Docear, JabRef, …) needs to be transferred to Mr. DLib, stored, and processed, and recommendations need to be calculated and returned. This process requires many aspects to be considered, such as scalability, security and privacy, interoperability, extensibility, and, of course, calculating world-class recommendations that researchers love.
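As a rough illustration of the exchange described above, the following sketch takes a JSON request containing a user’s library and returns a JSON list of recommendations, using a very simple content-based approach (cosine similarity over title words). It is a toy under stated assumptions: the JSON field names `library` and `recommendations`, the `recommend` function, and the in-memory corpus are hypothetical and do not describe Mr. DLib’s actual API or algorithm.

```python
import json
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case a title and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(request_json, corpus, limit=2):
    """Return a JSON string with the corpus titles most similar to the user's library."""
    request = json.loads(request_json)
    # The user model is simply the bag of words of all titles in the library.
    profile = Counter(t for title in request["library"] for t in tokens(title))
    owned = {title.lower() for title in request["library"]}
    scored = [
        {"title": doc, "score": round(cosine(profile, Counter(tokens(doc))), 3)}
        for doc in corpus
        if doc.lower() not in owned  # do not recommend what the user already has
    ]
    scored = [s for s in scored if s["score"] > 0]
    scored.sort(key=lambda s: s["score"], reverse=True)
    return json.dumps({"recommendations": scored[:limit]})


# Invented toy corpus and request, for illustration only.
corpus = [
    "Evaluating research paper recommender systems",
    "Citation analysis for digital libraries",
    "Deep learning for image segmentation",
    "A survey of recommender systems",
]
request = json.dumps({"library": [
    "A survey of recommender systems",
    "Research paper recommendation in digital libraries",
]})
response = json.loads(recommend(request, corpus))
```

A real service would sit behind a RESTful endpoint and would need to address the scalability, privacy, and interoperability aspects mentioned above; raw term counts are used here only to keep the example short.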

Research Opportunities

The project offers research opportunities in the field of interoperability, data-format standards in the domain of digital libraries, data processing, and scalability. More importantly, this project lays the foundation for all the following (research) projects (see below).

Required Skills

Must

  • JAVA
  • Databases such as MySQL
  • Linux (Basic)

Desired

  • JavaScript
  • Knowledge of recommendation concepts

Optional

  • Any of the fields, technologies and data formats listed in the project description (Jersey, Mahout, LensKit, …)

Research-Paper Recommendations: A Novel Approach

Research & Software Development Goals

There are around 90 different approaches to giving research-paper recommendations. Your goal will be to develop a novel research-paper recommendation approach that is more effective than those currently available. To achieve this goal, you will integrate some of the existing approaches into Mr. DLib’s “Recommender System as a Service”, and either develop a completely new approach or enhance the existing ones. We have many ideas on how existing approaches could be enhanced. For instance, the ranking process could be improved by using scientometrics (e.g. citation counts of papers, h-index, …), and there are many more options that we are happy to share with you in a personal discussion.

The project is highly attractive in two ways. First, you will work extensively with standard recommendation frameworks that are used in all domains (news, movies, …). Hence, you will gain valuable skills for the job market. Second, you will get a deep insight into research-paper recommender systems, which is an attractive field for further research, e.g. as part of a PhD.

Required Skills

Must

  • JAVA

Desired

  • Knowledge about recommendation concepts and recommender-systems evaluation
  • Data Analysis Tools (Excel, SPSS, R, …)

Optional

  • Recommendation Frameworks (Mahout & LensKit)
  • Web Services
  • Databases

Recommender-Systems Evaluation & Reproducibility

Background

The reproducibility of experimental results is the “fundamental assumption” in science, and the “cornerstone” for drawing meaningful conclusions about the generalizability of ideas. Recently, we found that reproducibility is rarely achieved in the recommender-systems community, particularly in the research-paper recommender-system community. In a review of 89 research-paper recommender-systems evaluations, we identified several cases in which only slight variations in the initial setup of the evaluation or approaches led to surprisingly different results.

Software Development & Research Goals

We want to find out how recommender systems should ideally be implemented and evaluated to ensure reproducible results. To achieve this goal,

  • you will implement a number of research-paper recommendation approaches (or simply use some existing ones),
  • these approaches will then be used to give recommendations to the users of the applications that use Mr. DLib’s recommender system (Docear, JabRef, and maybe others),
  • you will analyze how the recommendation approaches perform in the different scenarios and try to identify the factors that affect the recommendation effectiveness. This includes making controlled changes to the algorithms and to the applications that display the recommendations.
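When comparing two approaches or two set-ups, one would typically also check whether an observed difference could be due to chance. The sketch below uses a standard two-proportion z-test on click-through rates with a normal approximation; the choice of this test, the function name, and the example counts are assumptions for illustration and are not taken from our paper.

```python
import math


def two_proportion_z_test(clicks_a, shown_a, clicks_b, shown_b):
    """Return (z, two-sided p-value) for the difference between two click-through rates."""
    p_a = clicks_a / shown_a
    p_b = clicks_b / shown_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / se
    # Two-sided p-value of a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: approach A got 60 clicks and approach B 40 clicks,
# each out of 1000 delivered recommendations.
z, p = two_proportion_z_test(60, 1000, 40, 1000)
```

The normal approximation is only reasonable for fairly large samples; for small click counts an exact test would be a safer choice.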

If you are interested in this project, please have a look at our paper that will soon be published in the journal “User Modeling and User-Adapted Interaction (UMUAI)”. The paper gives you a detailed overview of the topic of reproducibility. Your task would be to continue what we started for the paper.

Required Skills

Must

  • Knowledge in statistics
  • Data Analysis Tools (Excel, SPSS, or R, …)

Desired

  • Knowledge about recommendation concepts and recommender-systems evaluation
  • Basic programming and database knowledge

Optional

  • XML
  • Web Services

Locations

Tokyo

In Tokyo, we cooperate with Prof. Akiko Aizawa at the National Institute of Informatics (NII), which is one of Japan’s most respected research institutes in the field of information science. Our co-founder, Prof. Bela Gipp, spent his postdoctoral time at the NII, and I will also be at the NII from April 2016 onward.

Tokyo is vast: it’s best thought of not as a single city, but a constellation of cities that have grown together. Tokyo’s districts vary wildly by character, from the electronic blare of Akihabara to the Imperial gardens and shrines of Chiyoda, from the hyperactive youth culture Mecca of Shibuya to the pottery shops and temple markets of Asakusa. If you don’t like what you see, hop on the train and head to the next station, and you will find something entirely different.

The sheer size and frenetic pace of Tokyo can intimidate the first-time visitor. Much of the city is a jungle of concrete and wires, with a mass of neon and blaring loudspeakers. At rush hour, crowds jostle in packed trains and masses of humanity sweep through enormous and bewilderingly complex stations. Don’t get too hung up on ticking tourist sights off your list: for most visitors, the biggest part of the Tokyo experience is just wandering around at random and absorbing the vibe, poking your head into shops selling weird and wonderful things, sampling restaurants where you can’t recognize a single thing on the menu (or on your plate), and finding unexpected oases of calm in the tranquil grounds of a neighborhood Shinto shrine. It’s all perfectly safe, and the locals will go to sometimes extraordinary lengths to help you if you just ask.
Source: Wikitravel

Copenhagen

In Copenhagen, we cooperate with Prof. Alesia Zuccala at the Royal School of Library and Information Science (RSLIS). The RSLIS has a long tradition of research in the field of (digital) libraries, bibliometrics, and information science, and hence represents an ideal partner for developing a machine-readable digital library, and research-paper recommender systems.

Copenhagen is the capital of Denmark and what a million Danes call home. This “friendly old girl of a town” is big enough to be a metropolis with shopping, culture and nightlife par excellence, yet still small enough to be intimate, safe and easy to navigate. Overlooking the Øresund strait with Sweden just minutes away, it is a cultural and geographic link between mainland Europe and Scandinavia. This is where old fairy tales blend with flashy new architecture and world-class design; where warm jazz mixes with cold electronica from Copenhagen’s basements. You’ll feel you’ve seen it all in a day, but could keep on discovering more for months.
Source: Wikitravel

Konstanz (Germany)

The University of Konstanz is home to our team members Prof. Bela Gipp, Corinna Breitinger, and Norman Meuschke, and it is one of only eleven “Excellence” universities in Germany. The Information Science Group, chaired by Bela Gipp, does research in the fields of recommender systems, plagiarism detection, and document analysis, and provides an excellent environment for researchers.

Konstanz has traces of civilization dating from the stone age and was settled by the Romans in about 50 CE. Konstanz was an important trade centre and a spiritual centre. At the council of Konstanz in 1414-1418, a papal election was held, ending the papal schism. Konstanz attempted to join the Swiss Confederacy in about 1460, but was voted down. Due to its proximity to Switzerland, Konstanz was not bombed during world war II and its historic old town remains intact. It is a historic city with a charming old town, and could be called the jewel of the region.
Source: Wikitravel

Funding

Master & PhD students (for Tokyo)

The NII offers research internships for Master and PhD students, and one of these internships can be given to a Master or PhD student for supporting our projects. The compensation will be around 1,400-1,600 US$. The internship would start between April and August and last 2-12 months. To receive the scholarship, you apply, and the Docear team preselects one or two candidates. Then, the candidates (supported by the Docear team) write a project proposal that is reviewed by the NII. If the NII approves the proposal, you can book your flight :-).

German Master & PhD students (for Tokyo or Copenhagen)

For German Master and PhD students in the field of computer science, the DAAD offers the “FIT Weltweit” scholarships for 1-6 months. You could apply for a scholarship for a research stay at either the NII in Tokyo or the RSLIS in Copenhagen. The scholarship would be around 850 Euros (Master students) or 1,700 Euros (PhD students) per month, plus travel expenses. If you are interested in applying for a scholarship, let us know, and we will support you in writing the application.

German Postdocs (for Tokyo)

The German Academic Exchange Service (DAAD) has a special scholarship program for postdoctoral researchers. The program “DAAD FIT Weltweit for PostDocs” provides excellent conditions for staying at the NII in Tokyo for 3-24 months. During your scholarship, you can pursue a research project that you agreed on with the NII, and you receive compensation of around 3,400 Euros per month, plus some additional benefits. If you are interested in such a scholarship, contact us, and we will help you create a project proposal and apply for the scholarship. Of course, we cannot give any guarantee of success. Ultimately, the NII and DAAD reviewers have to accept the proposal. Contact us for more information.

German Bachelor & Master Students (for Konstanz)

If you are a student at the University of Konstanz, we might employ you as a student worker (Hiwi), or you might do any of the projects as a Bachelor, Master, or PhD project. Even if you are at another German university, we might be able to employ you as a student worker, though we cannot promise this yet. Contact us if you are interested in any of the projects.

All others

There are usually many options for students to spend some time abroad and to do some research projects. Ask your professors or study advisers if they know of any funding opportunities. Look at the websites of your national research councils or similar organizations (in Germany that would be the DFG or DAAD). If you find a suitable program and need our help to apply for that program, let us know.

Even if you are not eligible for any funding opportunities, please send us your application. If new funding options come up in the future, we will contact you. In addition, feel free to join the Docear development team as a volunteer. You won’t get paid, but you would work on an amazing project that is ideal for learning new technologies and doing great research.

Apply & Contact Details


To apply, send your cover letter (including 1-2 pages of motivation) and CV to me, Joeran Beel (beel@docear.org). Please explain in detail which project(s) you are interested in, when you could start your internship, how long you could stay, where you would want to do the internship, and which funding options apply to you. We will then get back to you as soon as possible with further information on how to proceed. If you have any questions, do not hesitate to send me an email (beel@docear.org).

New paper for UMAP’15: Exploring the Potential of User Modeling based on Mind Maps

One reason why we originally started the development of Docear was our interest in how people create mind-maps and how the information contained in mind-maps could be used for building recommender systems and other user-modeling applications. As a result, we developed Docear’s research-paper recommender system. If you are interested in how the recommender system works, you might like our new research article, which was just accepted at the 23rd “Conference on User Modelling, Adaptation, and Personalization” (UMAP’15). The article is titled “Exploring the Potential of User Modeling based on Mind Maps”, the pre-print is available here, and this is the abstract:

Mind maps have not received much attention in the user modeling and recommender system community, although mind maps contain rich information that could be valuable for user-modeling and recommender systems. In this paper, we explored the effectiveness of standard user-modeling approaches applied to mind maps. We develop novel user modeling approaches that consider the unique characteristics of mind maps. The approaches are applied and evaluated using our mind-mapping and reference-management software Docear. Docear displayed 430,893 research paper recommendations, based on 4,700 user mind maps, from March 2013 to August 2014. The evaluation shows that standard user modeling approaches are reasonably effective when applied to mind maps, with click-through rates (CTR) between 1.16% and 3.92%. However, when adjusting user modeling to the unique characteristics of mind maps, a higher CTR of 7.20% could be achieved. A user study confirmed the high effectiveness of the mind map specific approach with an average rating of 3.23 (out of 5), compared to a rating of 2.53 for the best baseline. Our research shows that mind-map specific user modeling has a high potential, and we hope that our results initiate a discussion that encourages researchers to do research in this field and developers to integrate recommender systems to their mind-mapping tools.

Download Preprint

The conference takes place 29th June to 3rd July 2015 in Dublin. Let us know if you are also attending.

How to proceed with the development of the Docear4LibreOffice add-on?

More than half a year ago, we started a call for donations to pay a freelancer who wanted to develop an add-on for LibreOffice and OpenOffice, comparable to Docear4Word. Originally, we estimated that it would take about 2 months before the work was completed, or at least before a decent demo version was ready to be released. Well, that estimate wasn’t quite precise – the developer hasn’t finished even an alpha version yet. In addition, we are still missing a significant amount of donations to fully pay the developer ($1,000 are missing).

The question arises, how to proceed? We see the following options:

1. Just wait 

The freelancer is still working on the add-on. So, most likely he will finish the add-on some day – maybe in 2 months, maybe in 6 months, maybe in a year. However, I have to point out that my satisfaction with the current progress and outcomes is not overwhelming. Personally, I have some doubts that the final add-on will meet the quality expectations that I have, and that most Docear users probably have. However, I suggest you get an idea of the add-on yourself. The freelancer sent me a demo version that you can try out. To do so, download the add-on, store it on your hard drive, and open the downloaded file with LibreOffice or OpenOffice. This should open an installation dialog, and you need to confirm all messages in the dialog. After the installation, you should restart Libre/OpenOffice. If you are using OpenOffice, you will have a Docear entry in the menu and in the tool bar (see screenshot below). If you are using LibreOffice, you will only have an entry in the menu.

Read more…

Update of the Google Scholar PDF Metadata Retrieval Library

In the past few weeks, several users reported that Docear could no longer retrieve metadata from Google Scholar. It took us a while, but now we have found and, hopefully, fixed the problem. We need your help to test whether the fixed version really works well for all of you (it’s very hard for us to test because Google Scholar often did not block us while it did block some users).

Here is how you can install the new Google Scholar PDF Metadata Retrieval Library:

  1. Close Docear if it’s currently running
  2. Download the Google Scholar library.
  3. Replace the existing library with the new one. You will find your old library in a path like C:\Program Files (x86)\Docear\plugins\org.docear.plugin.bibtex\lib\
  4. Start Docear
  5. Request metadata 🙂

Please let us know if the new library works, in particular if you had problems with the old one. Please note that even with the new library you might have to enter captchas occasionally. However, with the old library, captchas occurred again and again for some users. This problem should be fixed (after entering the captcha correctly, you should be able to retrieve metadata for at least a few dozen PDFs).

New Pre-print: The Architecture and Datasets of Docear’s Research Paper Recommender System

Our paper “The Architecture and Datasets of Docear’s Research Paper Recommender System” was accepted at the 3rd International Workshop on Mining Scientific Publications (WOSP 2014), which is held in conjunction with the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2014). This means we will be in London from September 9 until September 13 to present our paper. If you are interested in research paper recommender systems, feel free to read the pre-print. If you find any errors, let us know before August 25 – that’s the date when we have to submit the camera-ready version.

Here is the abstract:

In the past few years, we have developed a research paper recommender system for our reference management software Docear. In this paper, we introduce the architecture of the recommender system and four datasets. The architecture comprises of multiple components, e.g. for crawling PDFs, generating user models, and calculating content-based recommendations. It supports researchers and developers in building their own research paper recommender systems, and is, to the best of our knowledge, the most comprehensive architecture that has been released in this field. The four datasets contain metadata of 9.4 million academic articles, including 1.8 million articles freely available on the Web; the articles’ citation network; anonymized information on 8,059 Docear users; information about the users’ 52,202 mind-maps and personal libraries; and details on the 308,146 recommendations that the recommender system delivered. The datasets are a unique source of information to enable, for instance, research on collaborative filtering, content-based filtering, and the use of reference management and mind-mapping software.

Full-text (PDF)

Datasets (available from mid-September)

Docear 1.1.1 Beta with Academic Search Feature

As you may know, Docear features a recommender system for academic literature. To find out which papers you might be interested in, it parses your mind maps and compares them to our digital library with currently about 1.8 million academic articles. While this is helpful and might point you to papers relevant for your general research goals, you will sometimes have to find information on a specific topic and hence search directly.

Based on our knowledge about recommender systems and some user requests, we decided to implement a direct search feature on our digital library. I am very grateful to Keystone, who supported me in visiting Dr. Georgia Kapitsaki at the University of Cyprus (UCY) in Nicosia for a full month to work on this idea. Dr. Kapitsaki had already supported us in our work on Docear’s recommender system in July 2013. Her knowledge about the inner mechanics and her ideas on the search engine were essential for the implementation and the research part of the project.

How to use it

You can access the search feature from Docear’s ribbon bar (“Search and Filter > Documents > Online search”) or by double-clicking the “Online search” entry in Docear’s workspace panel. Since both the recommender system and the personalized search engine make use of your mind maps, you need to enable the recommendation service in Docear.

After opening the search page, you will see

  • a text box for your search query,
  • a “Search” button, and
  • several buttons below the text box reflecting search terms you might be interested in. If Docear does not have enough data to decide about your interests, this part remains empty.

Read more…

Docear’s new workspace and workflow concept: We need your feedback!

In the past years, Docear has evolved into a powerful tool for managing literature and references. However, we have to admit that Docear is still not as user-friendly as we would like it to be. This is mainly caused by the workspace concept, which is not very intuitive. We are aware of this problem and would like to fix it. Therefore, we spent the last few weeks brainstorming and discussing, and we came up with a new concept. We believe it to be more intuitive and more similar to the concepts you know from other reference managers. In the following, we would like to introduce our ideas for the new workspace concept and some other changes, and we ask you for your feedback. Please let us know in the comments if you like our ideas and how we could make the concept even better.

This is what the new workspace panel would look like after you freshly installed Docear and sorted a few PDFs including annotations.

There are four main categories in the workspace panel (left).

Read more…

Research Paper Recommender Systems: A Literature Survey (Preprint)

As some of you might know, I am a PhD student, and the focus of my research lies on research paper recommender systems. Now, I am about to finish an extensive literature review of more than 200 research articles on research paper recommender systems. My colleagues and I summarized the findings in this 43-page preprint. The preprint is at an early stage, and we still need to double-check some numbers, improve the grammar, etc., but we would like to share the article anyway. If you are interested in the topic of research paper recommender systems, it will hopefully give you a good overview of the field. The review is also quite critical and should give you some good ideas about the current problems and interesting directions for further research.

If you read the preprint and find any errors, or if you have suggestions on how to improve the survey, please let us know and send us an email. If you would be interested in proof-reading the article, let us know, and we will send you the MS-Word document.

Abstract. Between 1998 and 2013, more than 300 researchers published more than 200 articles in the field of research paper recommender systems. We reviewed these articles and found that content based filtering was the predominantly applied recommendation concept (53%). Collaborative filtering was only applied by 12% of the reviewed approaches, and graph based recommendations by 7%. Other recommendation concepts included stereotyping, item-centric recommendations and hybrid recommendations. The content based filtering approaches mainly utilized papers that the users had authored, tagged, browsed, or downloaded. TF-IDF was the most applied weighting scheme. Stop-words were removed only by 31% of the approaches, stemming was applied by 24%. Aside from simple terms, also n-grams, topics, and citations were utilized. Our review revealed some serious limitations of the current research. First, it remains unclear which recommendation concepts and techniques are most promising. Different papers reported different results on, for instance, the performance of content based and collaborative filtering. Sometimes content based filtering performed better than collaborative filtering and sometimes it was exactly the opposite. We identified three potential reasons for the ambiguity of the results. First, many of the evaluations were inadequate, for instance, with strongly pruned datasets, few participants in user studies, and no appropriate baselines. Second, most authors provided sparse information on their algorithms, which makes it difficult to re-implement the approaches or to analyze why evaluations might have provided different results. Third, we speculated that there is no simple answer to finding the most promising approaches and minor variations in datasets, algorithms, or user population inevitable lead to strong variations in the performance of the approaches. 
A second limitation related to the fact that many authors neglected factors beyond accuracy, for example overall user satisfaction and satisfaction of developers. In addition, the user modeling process was widely neglected. 79% of the approaches let their users provide some keywords, text snippets or a single paper as input, and did not infer information automatically. Only for 11% of the approaches, information on runtime was provided. Finally, it seems that much of the research was conducted in the ivory tower. Barely any of the research had an impact on the research paper recommender systems in practice, which mostly use very simple recommendation approaches. We also identified a lack of authorities and persistence: 67% of the authors authored only a single paper, and there was barely any cooperation among different co-author groups. We conclude that several actions need to be taken to improve the situation. Among others, a common evaluation framework is needed, a discussion about which information to provide in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.

New Paper: Utilizing Mind-Maps for Information Retrieval and User Modelling

We recently submitted a paper to UMAP (The Conference on User Modelling, Adaptation, and Personalization). The paper was about how mind-maps could be utilized by information retrieval applications. The paper got accepted, which means we will be in Aalborg, Denmark from July 7 until July 11 to present the paper. If you are a researcher in the field of information retrieval, user modelling, or mind-mapping, you might be interested in the pre-print. By the way, if you find any errors in it, we would highly appreciate it if you told us (ideally today). Similarly, if you are interested in a research partnership, or if you are also at UMAP 2014 and would like to discuss our research, please contact us.

Abstract. Mind-maps have been widely neglected by the information retrieval (IR) community. However, there are an estimated two million active mind-map users, 300,000 public mind-maps, and 5 million new, non-public, mind-maps created every year – a potentially rich source for information retrieval applications. In this paper, we present eight ideas on how mind-maps could be utilized by IR applications. For instance, mind-maps could be utilized to generate user models for recommender systems or expert search, or to calculate relatedness of web-pages that are linked in mind-maps. We evaluated the feasibility of the eight ideas, based on estimates of the number of available mind-maps, an analysis of the content of mind-maps, and an evaluation of the users’ acceptance of the ideas. Based on this information, we concluded that user modelling is the most promising application with respect to mind-maps. A user modelling prototype, i.e. a recommender system for the users of our mind-mapping software Docear, was implemented, and its effectiveness evaluated. Depending on the applied user modelling approaches, the effectiveness, i.e. click-through rate on recommendations, varied between 0.28% and 6.24%. This indicates that mind-map based user modelling is promising, but not trivial, and that further research is required to increase effectiveness.

View pre-print

Wanted: Participants for a User Study about Docear’s Recommender System

We kindly ask you to participate in a brief study about Docear’s recommender system. Your participation will help us to improve the recommender system, and to secure long-term funding for the development of Docear in general! If you are willing to invest 15 minutes of your time, then please continue reading.

Participate in the Study

  1. Start Docear
  2. Click on the “Show Recommendations” button.
  3. Click on all recommendations, so they open in your web-browser. Click on them even if you know a paper already, or if a paper was recommended previously.
  4. For each recommended paper, please read at least the abstract. You may also read the entire paper if you like, or at least skim through it.
  5. Rate the recommendations: The better the current recommendations are, the more stars you should give. Please note that:
    • Ratings should only reflect the relevance of the current set of recommendations. Do not rate recommendations based on the quality of previously received recommendations.
    • Please do not rate recommendations poorly only because they were shown previously. When you receive recommendations that were shown previously, you should give the same rating as previously.
    • Please do not rate recommendations poorly because the link to the PDF was dead, and you could not read the PDF. Dead links do not depend on the recommender system, and you should just ignore them. In other words: Please rate only the quality of those recommendations that you could actually read. If none of the recommendations you clicked could be read, just give no rating at all.
  6. The user study runs until July 15th. We would highly appreciate it if you could receive and rate recommendations a couple of times during that period (click the green refresh icon to receive new recommendations). The more recommendations you rate, the better for our research. However, even if you rate recommendations only once, you will help us a lot.

Very important: Please let us know your Docear username so we know you participated in the study – send it to info@docear.org. You can find your username in the bottom-left corner of the status bar (see picture).

In addition, we would very much appreciate it if you provided us with the following information:

Age:
Gender:
Nationality:
Status: (e.g. Professor, PhD Student, Master Student, …)
Field of Research:
Approximately since when have you been using Docear:

Docear4Word 1.30: Faster and more robust BibTeX key handling

Docear4Word 1.30 is available for download. We improved the error handling, the speed, and the robustness for special characters in BibTeX keys. Here are all changes in detail:

  • A database parsing error during Refresh now displays a message with line and column information.
  • More robustness for special characters in BibTeX keys:
    • Changing to the official spec of allowed characters was causing too many problems for existing libraries.
    • < > & % { } " plus whitespace characters are now disallowed, but any other character is valid.
    • An apostrophe ' is allowed in the database but is removed from the BibTeX key before use rather than being treated as invalid.
  • Some optimizations have been made so that moving through a large document with many fields should be faster
  • When the Reference tab is selected (and the Docear4Word buttons need to be updated), Docear4Word now makes the minimum number of checks necessary. It is still slower towards the end of a large document than the beginning but much improved over previous versions.
  • When the Reference tab is not selected, Docear4Word makes no checks and so cursor movement is at full speed regardless of position in the document.
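
To make the key rules above concrete, here is a minimal Python sketch of how such a check could look. It is not Docear4Word’s actual code; the function names are hypothetical, and rejecting empty keys is an assumption that the release notes do not mention.

```python
import re

# Characters listed as disallowed in the release notes:
# < > & % { } " plus any whitespace character.
_DISALLOWED = re.compile(r'[<>&%{}"\s]')


def is_valid_key(key: str) -> bool:
    """Return True if the key contains none of the disallowed characters.

    Rejecting the empty string is an assumption made for this sketch.
    """
    return bool(key) and _DISALLOWED.search(key) is None


def normalize_key(key: str) -> str:
    """Validate the key and strip apostrophes before it is used.

    Apostrophes are tolerated in the database but removed before use,
    instead of making the whole key invalid.
    """
    if not is_valid_key(key):
        raise ValueError(f"invalid BibTeX key: {key!r}")
    return key.replace("'", "")
```

For example, `normalize_key("O'Brien2013")` returns `OBrien2013`, while `normalize_key("Beel 2014")` raises a `ValueError` because of the space.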

Docear4LibreOffice / Docear4OpenOffice: Call for Donation (2500$)

–> Read here for the latest update <–

One of our users’ most requested features is an add-on for LibreOffice and OpenOffice similar to Docear4Word, which allows adding formatted references and bibliographies in Microsoft Word based on Docear’s BibTeX files. Unfortunately, we have no skills in developing add-ons for LibreOffice or OpenOffice, which is why we were looking for a freelancer to help us. Now, finally, we have found one. The freelancer is offering to develop a counterpart to Docear4Word that works with LibreOffice and OpenOffice. This means you will be able to select a reference from Docear’s BibTeX database, and the add-on will insert the in-text citation and the bibliography in your Libre/OpenOffice document. As with Docear4Word, you will be able to choose from more than 2,000 citation styles to format your references.

However, the freelancer is not developing the add-on for free. He asks for 2500 US$ (~1,900€), which we believe to be a fair price. Therefore, we kindly ask you to donate, so we can pay the freelancer to develop a Docear4Libre/OpenOffice. Of course, the add-on will be open-source, reading not only Docear’s BibTeX files but also the BibTeX files of other BibTeX-based reference managers. The freelancer has already developed a simple proof of concept (see screenshot), which uses citeproc-java to add BibTeX-based references. As such, we have no doubts that the freelancer will be able to deliver the promised add-on, provided we can collect enough money.

The freelancer is already working on the add-on, and his goal is to finish it in the next two months or so. However, as long as we cannot pay him, he will not release the add-on, even if he has finished his work (and if he learns that there are no donations coming, he might decide to stop his work at any time). Therefore, if you want a Docear4Libre/OpenOffice, please donate now! Donate 1$, 5$, 10$, 50$ or 500$. Any contribution matters, and the sooner we have all the money, the sooner you can manage your BibTeX references in LibreOffice and OpenOffice.

Donate via PayPal, or, to save PayPal fees, make a SEPA bank transfer to Docear, IBAN DE18700222000020015578, BIC FDDODEMMXXX. SEPA bank transfers are free of charge within the European Union.

We will keep you posted on the amount of donations, and any important news.
Read more…