Research Paper Recommender Systems: A Literature Survey (Preprint)

As some of you might know, I am a PhD student, and my research focuses on research paper recommender systems. I am about to finish an extensive literature review of more than 200 research articles on research paper recommender systems. My colleagues and I summarized the findings in this 43-page preprint. The preprint is at an early stage, and we still need to double-check some numbers, improve the grammar, etc., but we would like to share the article anyway. If you are interested in research paper recommender systems, it will hopefully give you a good overview of the field. The review is also quite critical and should give you some good ideas about current problems and interesting directions for further research.

If you read the preprint and find any errors, or if you have suggestions on how to improve the survey, please let us know by email. If you are interested in proofreading the article, let us know, and we will send you the MS Word document.

Abstract. Between 1998 and 2013, more than 300 researchers published more than 200 articles in the field of research paper recommender systems. We reviewed these articles and found that content-based filtering was the predominantly applied recommendation concept (53%). Collaborative filtering was only applied by 12% of the reviewed approaches, and graph-based recommendations by 7%. Other recommendation concepts included stereotyping, item-centric recommendations and hybrid recommendations. The content-based filtering approaches mainly utilized papers that the users had authored, tagged, browsed, or downloaded. TF-IDF was the most applied weighting scheme. Stop words were removed by only 31% of the approaches, and stemming was applied by 24%. Aside from simple terms, n-grams, topics, and citations were also utilized. Our review revealed some serious limitations of the current research. First, it remains unclear which recommendation concepts and techniques are most promising. Different papers reported different results on, for instance, the performance of content-based and collaborative filtering. Sometimes content-based filtering performed better than collaborative filtering and sometimes it was exactly the opposite. We identified three potential reasons for the ambiguity of the results. First, many of the evaluations were inadequate, for instance, with strongly pruned datasets, few participants in user studies, and no appropriate baselines. Second, most authors provided sparse information on their algorithms, which makes it difficult to re-implement the approaches or to analyze why evaluations might have provided different results. Third, we speculated that there is no simple answer to finding the most promising approaches and that minor variations in datasets, algorithms, or user population inevitably lead to strong variations in the performance of the approaches.
A second limitation related to the fact that many authors neglected factors beyond accuracy, for example overall user satisfaction and satisfaction of developers. In addition, the user modeling process was widely neglected. 79% of the approaches let their users provide some keywords, text snippets or a single paper as input, and did not infer information automatically. Runtime information was provided for only 11% of the approaches. Finally, it seems that much of the research was conducted in the ivory tower. Barely any of the research had an impact on the research paper recommender systems in practice, which mostly use very simple recommendation approaches. We also identified a lack of authorities and persistence: 67% of the authors authored only a single paper, and there was barely any cooperation among different co-author groups. We conclude that several actions need to be taken to improve the situation. Among other things, the field needs a common evaluation framework, a discussion about which information to provide in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.
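To make the content-based filtering concept from the abstract a bit more concrete, here is a minimal sketch of TF-IDF weighting with stop-word removal and cosine similarity. It is not taken from any of the reviewed approaches: the tiny stop-word list, the function names, and the choice of raw term frequency times log(N/df) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "in", "on", "to", "with"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (t.strip(".,;:!?()").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (raw tf * log(N / df))."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user_paper, candidates):
    """Rank candidate papers by similarity to a paper the user authored."""
    vectors = tfidf_vectors([user_paper] + candidates)
    user_vec = vectors[0]
    scored = [(cosine(user_vec, vec), text)
              for vec, text in zip(vectors[1:], candidates)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    ranked = recommend(
        "Collaborative filtering for recommender systems",
        ["Recommender systems with collaborative filtering",
         "Sailing on the Baltic Sea"],
    )
    for score, title in ranked:
        print(f"{score:.2f}  {title}")
```

A real system would add stemming, a proper stop-word list, and a user model built from many papers rather than a single one.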

New Paper: Utilizing Mind-Maps for Information Retrieval and User Modelling

We recently submitted a paper to UMAP (The Conference on User Modelling, Adaptation, and Personalization). The paper is about how mind-maps could be utilized by information retrieval applications. The paper got accepted, which means we will be in Aalborg, Denmark, from July 7 until July 11 to present it. If you are a researcher in the field of information retrieval, user modelling, or mind-mapping, you might be interested in the pre-print. By the way, if you find any errors in it, we would highly appreciate it if you told us (ideally today). Similarly, if you are interested in a research partnership, or if you are also at UMAP 2014 and would like to discuss our research, please contact us.

Abstract. Mind-maps have been widely neglected by the information retrieval (IR) community. However, there are an estimated two million active mind-map users, 300,000 public mind-maps, and 5 million new, non-public, mind-maps created every year – a potentially rich source for information retrieval applications. In this paper, we present eight ideas on how mind-maps could be utilized by IR applications. For instance, mind-maps could be utilized to generate user models for recommender systems or expert search, or to calculate relatedness of web-pages that are linked in mind-maps. We evaluated the feasibility of the eight ideas, based on estimates of the number of available mind-maps, an analysis of the content of mind-maps, and an evaluation of the users’ acceptance of the ideas. Based on this information, we concluded that user modelling is the most promising application with respect to mind-maps. A user modelling prototype, i.e. a recommender system for the users of our mind-mapping software Docear, was implemented, and its effectiveness evaluated. Depending on the applied user modelling approaches, the effectiveness, i.e. click-through rate on recommendations, varied between 0.28% and 6.24%. This indicates that mind-map based user modelling is promising, but not trivial, and that further research is required to increase effectiveness.
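The effectiveness measure used in the abstract, click-through rate, is simply the share of delivered recommendations that were clicked. The sketch below shows the calculation. The click and delivery counts are hypothetical numbers chosen only so that the results match the two reported endpoints; they are not the real data from the paper, and the approach names are placeholders.

```python
def click_through_rate(clicks, delivered):
    """Return the click-through rate in percent."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return 100.0 * clicks / delivered


# Hypothetical counts (assumption): (clicks, delivered recommendations).
approaches = {
    "approach_a": (14, 5000),
    "approach_b": (312, 5000),
}

for name, (clicks, delivered) in approaches.items():
    print(f"{name}: {click_through_rate(clicks, delivered):.2f}%")
```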

View pre-print

 

Wanted: Participants for a User Study about Docear’s Recommender System

A while ago, we published a paper about evaluating recommender systems, titled “A Comparative Analysis of Offline and Online Evaluations and Discussion of Research Paper Recommender System Evaluation”. We want to do a similar analysis, but based on a user study. This means we need Docear users who are willing to receive some recommendations, read the recommended documents (e.g. the abstracts, or ideally the entire article), and then tell us how good the recommendations are.

If you are interested in participating, which we would really, really appreciate, please send an email to info@docear.org, and you will receive further information. The study will take about 15 minutes of your time. Your participation will help improve Docear’s recommender system, and it will help us finish our current research paper about evaluating recommender systems. This matters because the more and better papers we publish, the higher the chance that we can secure long-term funding for the development of Docear!

So please don’t be shy and send an email to info@docear.org :-).

Docear4Word 1.30: Faster and more robust BibTeX key handling

Docear4Word 1.30 is available for download. We improved the error handling, the speed, and the robustness for special characters in BibTeX keys. Here are all changes in detail:

  • A database parsing error during Refresh now displays a message with line and column information.
  • More robustness for special characters in BibTeX keys:
    • Changing to the official spec of allowed characters was causing too many problems for existing libraries.
    • < > & % { } ” plus whitespace characters are now disallowed, but any other character is valid.
    • An apostrophe ‘ is allowed in the database but is removed from the BibTeX key before use rather than being treated as invalid.
  • Some optimizations have been made so that moving through a large document with many fields should be faster.
  • When the Reference tab is selected (and the Docear4Word buttons need to be updated), Docear4Word now makes the minimum number of checks necessary. It is still slower towards the end of a large document than the beginning but much improved over previous versions.
  • When the Reference tab is not selected, Docear4Word makes no checks and so cursor movement is at full speed regardless of position in the document.
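The key rules listed above can be sketched in a few lines. This is not Docear4Word's actual code (Docear4Word is a .NET add-in); it is a hedged Python illustration of the stated rules. The function name is made up, and it assumes that the quote and apostrophe shown in the list are the plain ASCII characters `"` and `'`, which the blog software probably converted to typographic quotes.

```python
# Characters the release notes list as disallowed in BibTeX keys.
# Assumption: the double quote is the ASCII character ".
DISALLOWED = set('<>&%{}"')


def normalize_bibtex_key(key):
    """Return the key as it would be used, or raise ValueError if invalid.

    Apostrophes are silently removed instead of being rejected;
    whitespace and the characters in DISALLOWED make the key invalid.
    """
    cleaned = key.replace("'", "")  # assumption: ASCII apostrophe
    for ch in cleaned:
        if ch in DISALLOWED or ch.isspace():
            raise ValueError(f"invalid character {ch!r} in BibTeX key {key!r}")
    return cleaned


if __name__ == "__main__":
    for candidate in ["O'Brien2014", "Müller_2013!", "Smith 2012", "a{b}"]:
        try:
            print(candidate, "->", normalize_bibtex_key(candidate))
        except ValueError as err:
            print(candidate, "-> rejected:", err)
```

Any character outside the disallowed set passes through unchanged, which mirrors the decision described above to stay more permissive than the official specification.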

Docear4LibreOffice / Docear4OpenOffice: Call for Donation (2500$)

For each dollar you donate, Uberstudent.com donates another dollar!

One of our users’ most requested features is an add-on for LibreOffice and OpenOffice similar to Docear4Word, which allows users to add formatted references and bibliographies in Microsoft Word based on Docear’s BibTeX files. Unfortunately, we have no skills in developing add-ons for LibreOffice or OpenOffice, which is why we were looking for a freelancer to help us. Now, finally, we have found one. The freelancer is offering to develop a counterpart to Docear4Word that works with LibreOffice and OpenOffice. This means you will be able to select a reference from Docear’s BibTeX database, and the add-on will insert the in-text citation and the bibliography in your LibreOffice/OpenOffice document. As with Docear4Word, you will be able to choose from more than 2,000 citation styles to format your references.

However, the freelancer is not developing the add-on for free. He asks for 2500 US$ (~1,900€), which we believe to be a fair price. Therefore, we kindly ask you to donate, so we can pay the freelancer to develop Docear4Libre/OpenOffice. Of course, the add-on will be open source, and it will read not only Docear’s BibTeX files but also the BibTeX files of other BibTeX-based reference managers. The freelancer has already developed a simple proof of concept (see screenshot), which uses citeproc-java to add BibTeX-based references. As such, we have no doubt that the freelancer will be able to deliver the promised add-on, provided we can collect enough money.

The freelancer is already working on the add-on, and his goal is to finish it in the next two months or so. However, as long as we cannot pay him, he will not release the add-on, even if he has finished his work (and if he learns that no donations are coming, he might decide to stop his work at any time). Therefore, if you want a Docear4Libre/OpenOffice, please donate now! Donate 1$, 5$, 10$, 50$ or 500$: any contribution matters, and the sooner we have all the money, the sooner you can manage your BibTeX references in LibreOffice and OpenOffice. And there is good news: Stephen from UberStudent made the very generous offer to match each donation. This means that when you donate 10$, Stephen from UberStudent will donate 10$; when you donate 100$, Stephen donates 100$, and so on.

Donate via PayPal, or, to save PayPal fees, make a SEPA bank transfer to Docear, IBAN DE51500100600853552606, BIC PBNKDEFF. SEPA bank transfers are free of charge within the European Union.

 

 


We will keep you posted on the amount of donations, and any important news.

Visit Docear at “CeBIT 2014”

On March 10th and March 11th we will be presenting Docear at CeBIT. CeBIT is the digital industry’s biggest and most international event and always worth a visit. We will be at HALL 9, STAND B18. Feel free to visit us and meet the Docear team in person (Stefan will be there, and maybe Joeran).

Docear 1.0.3 Beta: rate recommendation, new web interface, bug fixes, …

Update, February 18, 2014: No bugs were reported, so we declare Docear 1.0.3 stable. It can be downloaded from the normal download page.


With Docear 1.0.3 beta, we have improved PDF handling, provided some help for new users, and enhanced the way you can access your mind maps online.

PDF Handling

We fixed several minor bugs with regard to PDF handling. In previous versions of Docear, nested PDF bookmarks were imported twice when you dragged and dropped a PDF file onto the mind map. Renaming PDF files from within Docear changed the file links in your mind maps but did not change them in your BibTeX file. Both issues are fixed now. To rename a PDF file from within Docear, right-click it in Docear’s workspace panel on the left-hand side. It is important that the mind maps in which you have linked the file are open. We know this is still not ideal, and we will improve it in future versions of Docear.

Rate Your Recommendations

You already know about our recommender system for academic literature. If you want to help us improve it, you can now rate how well a specific set of recommendations reflects your personal field of interest. By the way, it would be nice if you did not rate a set of recommendations negatively only because it contains some recommendations you received previously. Currently, we have no mechanism to detect duplicate recommendations.



Docear 2013 in review and our plans for 2014

It’s almost a bit late to review 2013, but better late than never. 2013 was doubtlessly the most active and most successful year for Docear so far. First and foremost, we finally released Docear 1.0, after releasing many beta versions and release candidates. Of course, Docear 1.0 is far from perfect, but we are really proud of it, and we think it’s an awesome piece of software to manage references, PDFs, and much more. But there were many more noteworthy events, some of which we took pictures of:

We presented several research papers at the JCDL in Chicago, TPDL on Malta, and RecSys/RepSys in Hong Kong. It is always a pleasure to attend such conferences. Not only because they take place at really nice locations, but because you meet really interesting people (for instance Kris Jack from Mendeley, a really enthusiastic and smart guy who develops Mendeley’s recommender system, or Joseph A. Konstan, who is a true pioneer in the field of recommender systems).


TPDL on Malta


JCDL in Chicago


RecSys in Hong Kong

Almost every year, our mentor Prof. Andreas Nürnberger invites his team members on a sailing trip, and he did so again in 2013. For several days we sailed the Baltic Sea, learned a lot about teamwork, and had a lot of fun.


Sailing trip with our mentor Prof. Nürnberger, the Docear team, and some PhD students of his working group

We had the honour to supervise an excellent student team at HTW Berlin thanks to Prof. Weber-Wulff. The students did a great job in developing the Docear Web prototype. It’s a pity that the prototype has not yet found its way into our live system, but we have not had the time to give the prototype the last bug fixes and features it needs. However, this is very high on our todo list.


Four of the five students at HTW Berlin who developed a prototype of “Docear Web”

Docear is primarily located in Magdeburg, Germany, which is close to Berlin. Therefore, we didn’t think twice when Researchgate hosted the 10th “Recommender Stammtisch” (regulars’ table) in Berlin. There, we listened to an enlightening talk by Andreas Lommatzsch and an entertaining introduction by Researchgate’s CEO Ijad Madisch.



Comprehensive Comparison of Reference Managers: Mendeley vs. Zotero vs. Docear

Which one is the best reference management software? That’s a question any student or researcher should think about quite carefully, because choosing the best reference manager may save lots of time and increase the quality of your work significantly. So, which reference manager is best? Zotero? Mendeley? Docear? …? The answer is “It depends”, because different people have different needs. Actually, there is no such thing as the ‘best’ reference manager, only the reference manager that is best for you (even though some developers seem to believe that their tool is the only truly perfect one).

In this blog post, we compare Zotero, Mendeley, and Docear, and we hope that the comparison helps you decide which of the reference managers is best for you. Of course, there are many other reference managers. Hopefully, we can include them in the comparison some day, but for now we only have time to compare these three. We really tried to do a fair comparison, based on a list of criteria that we consider important for reference management software. Of course, the criteria are subjectively selected, as are all criteria by all reviewers, and you might not agree with all of them. However, even if you disagree with our evaluation, you might find at least some new and interesting aspects for evaluating reference management tools. You are very welcome to share your constructive criticism in the comments, as well as links to other reviews. In addition, it should be obvious that we – the developers of Docear – are somewhat biased. However, this comparison is most certainly more objective than those that Mendeley and other reference managers did ;-).

Please note that we only compared about 50 high-level features and used a simple rating scheme in the summary table. Of course, a more comprehensive list of features and a more sophisticated rating scheme would have been nice, but this would have been too time consuming. So, consider this review as a rough guideline. If you feel that one of the mentioned features is particularly important to you, install the tools yourself, compare the features, and share your insights in the comments! Most importantly, please let us know when something we wrote is not correct. All reviewed reference tools offer lots of functions, and it might be that we missed one during our review.

The table above provides an overview of how Zotero, Mendeley, and Docear support you in various tasks, how open and free they are, etc. Details on the features and ratings are provided in the following sections. As already mentioned, if you notice a mistake in the evaluation (e.g. a key feature we missed), please let us know in the comments.

Overview

If you don’t want to read a lot, just jump to the summary.

We believe that a reference manager should offer more features than simple reference management. It should support you in (1) finding literature, (2) organizing and annotating literature, (3) drafting your papers, theses, books, assignments, etc., (4) managing your references (of course), and (5) writing your papers, theses, etc. Additionally, many – but not all – students and researchers might be interested in (6) socializing and collaboration, (7) note, task, and general information management, and (8) file management. Finally, we think it is important that a reference manager (9) is available for the major operating systems, (10) has an information management approach you like (tables, social tags, search, …), and (11) is open, free, and sustainable (see also What makes a bad reference manager).


Docear4Word 1.23 Released

The new Docear4Word v1.23 is out as a beta version. Changes are:

  1. A more detailed error message when there is a parsing error in your BibTeX file.
  2. The latest v1.0.517 version of CiteProc-JS has been included. This should finally solve all the sorting and numbering issues.
  3. We made some adjustments that could improve the performance of Docear4Word, though we are not sure whether they really will.
  4. Special characters such as . ! ? _ ^ < > are now allowed at the beginning of a BibTeX name.

Please note that this is probably the last version that is compiled with VS2010 (requiring you to install .NET 2.0). The next release will be compiled with VS2013 (.NET 4) which should solve some compatibility issues with Windows 8.

We have not yet thoroughly tested the new version. So, if you want to be sure to get a stable version, wait a few days (if you don’t see any updates here in the next few days, users didn’t report any bugs and the current version is stable).

Download Docear4Word 1.23 Beta

Do a paid internship abroad at SciPlore – Summer 2014

The SciPlore team at Google HQ in Mountain View, CA


Our partnering research group SciPlore, from which Docear evolved, is offering a paid internship for a Bachelor student in the field of computer science, in cooperation with the German Academic Exchange Service (DAAD). The prerequisite for applying is that you study at a German university (if you are from the US, UK, or Canada, read here). More details on prerequisites here.

SciPlore is an international team of researchers affiliated with the University of Magdeburg in Germany and the University of California, Berkeley. As an intern, you will have the chance to spend 6-12 weeks abroad at a research institute collaborating with the SciPlore research team.
SciPlore researches novel approaches in citation and semantic text analysis for quantifying similarities between scientific articles. Similarity assessments are crucial to many Information Retrieval (IR) tasks, such as clustering of documents, recommending academic literature, or automatically detecting plagiarism.


Developer for Docear4Word (Mac) wanted

For more than a year, we have been offering Docear4Word, an add-in for Microsoft Word that helps you create bibliographies. Unfortunately, the add-in is only available for the Windows version of Microsoft Word. We would love to offer a Mac version, too, but we don’t have the skills to do this. If you know how to develop add-ons for MS Word on Mac, please contact us at info@docear.org. The add-on should be an exact copy of Docear4Word (Windows), and you could use Docear4Word’s source code as “inspiration”. You wouldn’t have to do this for free: we would be able to pay some money, and I am confident that our users would also donate a significant amount! Just let us know how long it would take you to implement this add-on and how much you would want for it.

Elsevier (i.e. the owner of Mendeley) “asks” the users of Academia.edu (i.e. a competitor of Mendeley) to take their papers down

A week ago, Elsevier sent messages to some users of Academia.edu, a social network for researchers (Source: Chronicle). Elsevier asked these users to remove some of their papers from their profile page at Academia.edu. Apparently, Elsevier wasn’t happy that the authors published papers that Elsevier holds the publishing rights for. It’s an interesting discussion whether Elsevier has the right to prohibit uploading papers on Academia’s profile page, because authors have the right to publish their articles on their private homepages. Now, authors might argue that their Academia.edu profile is their private homepage.

What is even more interesting is the fact that it’s Elsevier who did this. That is the same company that recently bought the reference manager Mendeley, which, coincidentally, also offers a social network and hence is a competitor of Academia.edu. I wonder whether Elsevier will soon start to send messages to Mendeley users telling them, too, not to upload their papers to their profile pages, or whether Elsevier will just send these messages to users of social networks such as Academia.edu and Researchgate to strengthen their own product Mendeley. Either way, it’s not a nice move from Elsevier. It confirms the negative attitude that many researchers have towards this publisher, and it brings back the doubt about Mendeley’s openness.

Some more detailed discussions on this topic can be found here:

Read more…

Docear 1.02 Beta: Serious PDF Bug Fix; added a donation button

We discovered a serious bug in Docear’s PDF management. In some situations, when you edited a PDF, the annotation IDs were not recognized correctly, and a conflict was shown. We fixed this bug and are publishing Docear 1.02 as a beta version today. Right now, the beta download is only available in our forum. We would appreciate it if you could test the new version. If no more serious bugs are found, we will publish it as the stable version without any further notification.

We also added a “Please Donate” note to the workspace panel. It leads you to our donation page and you are sincerely invited to make use of that page :-). If you have already donated, if you just don’t want to donate, or if you need every pixel in the workspace, do a right-click on that note and you will be able to hide it. In addition, we also changed the welcome page that opens after you have installed Docear.

New “Please donate” note in Docear

New “Welcome” page 


Donation volume rapidly increased during the past 2 weeks



After we reported the rather low donation volume a few weeks ago, donations rapidly increased during the past two weeks. In these two weeks we received more than 200 dollars, which is half of what we got in the past two years. We don’t know whether this is due to Christmas or some other reason, but we would like to sincerely thank all donors. Your donations significantly help to improve Docear!

 

Docear 1.01 with some minor improvements and bug fixes

A few days ago we released the experimental version of Docear and wrote about it in our experimental release forum (you can subscribe to that forum if you want to be informed about new experimental releases). Today we declare Docear 1.01 as stable and from now on it’s available on our primary download page. Changes are rather minor. 

Enhancements include

  • A slightly modified dialog for selecting your PDF viewer (some links were updated)
  • The labeling of the file monitoring settings is now more uniform
  • The colors for “Move …” in the “Nodes” ribbon were changed from green to blue. There’s quite a funny story behind it. One of our team members recently told me that the arrows for moving nodes pointed in the wrong direction. I told him that they were absolutely correct, and we had quite a discussion. Then we realized that the team member is (red-green) color blind and couldn’t recognize the green arrows properly. Well, now the arrows are blue (see screenshot) and everyone should be able to recognize them correctly :-)

In addition, we did some bug fixes.
