Update of the Google Scholar PDF Metadata Retrieval Library

In the past few weeks, several users reported that Docear could no longer retrieve metadata from Google Scholar. It took us a while, but we have now found and fixed the problem – hopefully. We need your help to test whether the fixed version really works for all of you (this is hard for us to test ourselves, because Google Scholar often did not block us while it did block some users).

Here is how you can install the new Google Scholar PDF Metadata Retrieval Library:

  1. Close Docear if it’s currently running
  2. Download the Google Scholar library.
  3. Replace the existing library with the new one. You will find the old library in a path like C:\Program Files (x86)\Docear\plugins\org.docear.plugin.bibtex\lib\
  4. Start Docear
  5. Request metadata :-)

Please let us know whether the new library works, in particular if you had problems with the old one. Note that even with the new library you might occasionally have to enter captchas. With the old library, however, captchas kept reappearing for some users. This problem should be fixed: after entering a captcha correctly, you should be able to retrieve metadata for at least a few dozen PDFs.

New Pre-print: The Architecture and Datasets of Docear’s Research Paper Recommender System

Our paper “The Architecture and Datasets of Docear’s Research Paper Recommender System” was accepted at the 3rd International Workshop on Mining Scientific Publications (WOSP 2014), which is held in conjunction with the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2014). This means we will be in London from September 9 until September 13 to present our paper. If you are interested in research paper recommender systems, feel free to read the pre-print. If you find any errors, let us know before August 25 – that is the date by which we have to submit the camera-ready version.

Here is the abstract:

In the past few years, we have developed a research paper recommender system for our reference management software Docear. In this paper, we introduce the architecture of the recommender system and four datasets. The architecture comprises multiple components, e.g. for crawling PDFs, generating user models, and calculating content-based recommendations. It supports researchers and developers in building their own research paper recommender systems, and is, to the best of our knowledge, the most comprehensive architecture that has been released in this field. The four datasets contain metadata of 9.4 million academic articles, including 1.8 million articles freely available on the Web; the articles’ citation network; anonymized information on 8,059 Docear users; information about the users’ 52,202 mind-maps and personal libraries; and details on the 308,146 recommendations that the recommender system delivered. The datasets are a unique source of information to enable, for instance, research on collaborative filtering, content-based filtering, and the use of reference management and mind-mapping software.

Full-text (PDF)

Datasets (available from mid-September)
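For the technically curious: the abstract mentions a component for calculating content-based recommendations. Here is a minimal, hypothetical Java sketch of what such a component does conceptually, representing the user model and a candidate paper as term-weight vectors and scoring the paper by cosine similarity. This is not Docear's actual code; the terms and weights are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal, hypothetical sketch of content-based matching: the user model and
// each candidate paper are term->weight maps; papers are ranked by cosine
// similarity to the user model. Not Docear's actual code.
public class ContentBasedMatcher {

    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double w = b.get(e.getKey());
            if (w != null) dot += e.getValue() * w;
            normA += e.getValue() * e.getValue();
        }
        for (double w : b.values()) normB += w * w;
        return (normA == 0 || normB == 0) ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Double> userModel = new HashMap<>();
        userModel.put("recommender", 0.8);
        userModel.put("mind-map", 0.6);

        Map<String, Double> paper = new HashMap<>();
        paper.put("recommender", 0.5);
        paper.put("evaluation", 0.7);

        System.out.printf("similarity = %.3f%n", cosine(userModel, paper));
    }
}
```

In Docear's case, the user model would be built from your mind maps, and the candidate papers would come from the crawled digital library described in the abstract.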

Docear 1.1.1 Beta with Academic Search Feature

As you may know, Docear features a recommender system for academic literature. To find out which papers you might be interested in, it parses your mind maps and compares them to our digital library of currently about 1.8 million academic articles. While this is helpful and might point you to papers relevant to your general research goals, you will sometimes need information on a specific topic and hence want to search for it directly.

Based on our knowledge about recommender systems and some user requests, we decided to implement a direct search feature on top of our digital library. I am very grateful to Keystone, who supported me in visiting Dr. Georgia Kapitsaki at the University of Cyprus (UCY) in Nicosia for a full month to work on this idea. Dr. Kapitsaki had already supported us in our work on Docear’s recommender system in July 2013. Her knowledge about the inner mechanics and her ideas on the search engine were essential for the implementation and the research part of the project.

How to use it

You can access the search feature from Docear’s ribbon bar (“Search and Filter > Documents > Online search”) or by double-clicking the “Online search” entry in Docear’s workspace panel. Since both the recommender system and the personalized search engine make use of your mind maps, you need to enable the recommendation service in Docear.


After opening the search page, you will see

  • a text box for your search query,
  • a “Search” button, and
  • several buttons below the text box reflecting search terms you might be interested in (see the sketch below this list for how such terms might be derived). If Docear does not have enough data to determine your interests, this part remains empty.
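To give you an idea of where these suggested terms could come from, here is a minimal, hypothetical Java sketch that simply counts how often words occur across mind-map node texts and suggests the most frequent ones. It is not Docear's actual algorithm; the stop-word list and node texts are made up.

```java
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch: suggest interest terms by counting word frequencies
// across mind-map node texts. Not Docear's actual algorithm.
public class InterestTerms {

    static final Set<String> STOP_WORDS = Set.of("the", "a", "of", "and", "for");

    static List<String> suggestTerms(List<String> nodeTexts, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : nodeTexts) {
            for (String token : text.toLowerCase().split("\\W+")) {
                if (token.length() < 3 || STOP_WORDS.contains(token)) continue;
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> nodes = List.of(
                "research paper recommender systems",
                "evaluation of recommender systems",
                "mind-map based user modelling");
        System.out.println(suggestTerms(nodes, 3)); // e.g. [recommender, systems, ...]
    }
}
```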

Screenshot: Docear’s online search interface

Read more…

Docear’s new workspace and workflow concept: We need your feedback!

Over the past years, Docear has evolved into a powerful tool for managing literature and references. However, we have to admit that Docear is still not as user friendly as we would like it to be. This is mainly caused by the workspace concept, which is not very intuitive. We are aware of this problem and we would like to fix it. Therefore, we spent the last weeks brainstorming and discussing, and came up with a new concept. We believe it to be more intuitive, and more similar to the concepts you know from other reference managers. In the following, we introduce our ideas for the new workspace concept and some other changes, and we ask you for your feedback. Please let us know in the comments if you like our ideas, and how we could make the concept even better.

This is how the new workspace panel would look after a fresh installation of Docear, once you have sorted a few PDFs including annotations (click the image to enlarge it).

There are four main categories in the workspace panel (left).

Read more…

Research Paper Recommender Systems: A Literature Survey (Preprint)

As some of you might know, I am a PhD student and the focus of my research is research paper recommender systems. I am now about to finish an extensive literature review of more than 200 research articles on research paper recommender systems. My colleagues and I summarized the findings in this 43-page preprint. The preprint is at an early stage – we still need to double-check some numbers, improve the grammar, etc. – but we would like to share the article anyway. If you are interested in the topic of research paper recommender systems, it will hopefully give you a good overview of the field. The review is also quite critical and should give you some good ideas about the current problems and interesting directions for further research.

If you read the preprint and find any errors, or if you have suggestions on how to improve the survey, please let us know by email. If you are interested in proof-reading the article, let us know and we will send you the MS Word document.

Abstract. Between 1998 and 2013, more than 300 researchers published more than 200 articles in the field of research paper recommender systems. We reviewed these articles and found that content based filtering was the predominantly applied recommendation concept (53%). Collaborative filtering was applied by only 12% of the reviewed approaches, and graph based recommendations by 7%. Other recommendation concepts included stereotyping, item-centric recommendations, and hybrid recommendations. The content based filtering approaches mainly utilized papers that the users had authored, tagged, browsed, or downloaded. TF-IDF was the most frequently applied weighting scheme. Stop-words were removed by only 31% of the approaches, and stemming was applied by 24%. Aside from simple terms, n-grams, topics, and citations were also utilized. Our review revealed some serious limitations of the current research. First, it remains unclear which recommendation concepts and techniques are most promising. Different papers reported different results on, for instance, the performance of content based and collaborative filtering. Sometimes content based filtering performed better than collaborative filtering, and sometimes it was exactly the opposite. We identified three potential reasons for the ambiguity of the results. First, many of the evaluations were inadequate, for instance, with strongly pruned datasets, few participants in user studies, and no appropriate baselines. Second, most authors provided sparse information on their algorithms, which makes it difficult to re-implement the approaches or to analyze why evaluations might have produced different results. Third, we speculated that there is no simple answer to finding the most promising approaches, and that minor variations in datasets, algorithms, or user populations inevitably lead to strong variations in the performance of the approaches. A second limitation relates to the fact that many authors neglected factors beyond accuracy, for example overall user satisfaction and satisfaction of developers. In addition, the user modeling process was widely neglected: 79% of the approaches let their users provide some keywords, text snippets, or a single paper as input, and did not infer information automatically. Information on runtime was provided for only 11% of the approaches. Finally, it seems that much of the research was conducted in the ivory tower. Barely any of the research had an impact on research paper recommender systems in practice, which mostly use very simple recommendation approaches. We also identified a lack of authorities and persistence: 67% of the authors authored only a single paper, and there was barely any cooperation among different co-author groups. We conclude that several actions need to be taken to improve the situation. Among others, a common evaluation framework is needed, as well as a discussion about which information to provide in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.
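As an aside for readers who are not familiar with the TF-IDF weighting scheme mentioned in the abstract: a term's weight in a document is its term frequency multiplied by the log-scaled inverse document frequency. Here is a minimal Java sketch with made-up toy documents; a real system would also remove stop-words and apply stemming, as discussed above.

```java
import java.util.*;

// Minimal TF-IDF sketch: weight(term, doc) = tf(term, doc) * log(N / df(term)).
// Toy documents; real systems would also remove stop-words and apply stemming.
public class TfIdf {

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                List.of("recommender", "systems", "evaluation"),
                List.of("recommender", "systems", "survey"),
                List.of("mind", "map", "user", "modelling"));

        // Document frequency: in how many documents does each term occur?
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }

        int n = docs.size();
        List<String> doc = docs.get(0);
        for (String term : new HashSet<>(doc)) {
            long tf = doc.stream().filter(term::equals).count();
            double weight = tf * Math.log((double) n / df.get(term));
            System.out.printf("%s: %.3f%n", term, weight);
        }
    }
}
```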

New Paper: Utilizing Mind-Maps for Information Retrieval and User Modelling

We recently submitted a paper to UMAP (The Conference on User Modelling, Adaptation, and Personalization). The paper is about how mind-maps could be utilized by information retrieval applications. The paper got accepted, which means we will be in Aalborg, Denmark from July 7 until July 11 to present it. If you are a researcher in the field of information retrieval, user modelling, or mind-mapping, you might be interested in the pre-print. By the way, if you find any errors in it, we would highly appreciate it if you told us (ideally today). Similarly, if you are interested in a research partnership, or if you are also at UMAP 2014 and would like to discuss our research, please contact us.

Abstract. Mind-maps have been widely neglected by the information retrieval (IR) community. However, there are an estimated two million active mind-map users, 300,000 public mind-maps, and 5 million new, non-public, mind-maps created every year – a potentially rich source for information retrieval applications. In this paper, we present eight ideas on how mind-maps could be utilized by IR applications. For instance, mind-maps could be utilized to generate user models for recommender systems or expert search, or to calculate relatedness of web-pages that are linked in mind-maps. We evaluated the feasibility of the eight ideas, based on estimates of the number of available mind-maps, an analysis of the content of mind-maps, and an evaluation of the users’ acceptance of the ideas. Based on this information, we concluded that user modelling is the most promising application with respect to mind-maps. A user modelling prototype, i.e. a recommender system for the users of our mind-mapping software Docear, was implemented, and its effectiveness evaluated. Depending on the applied user modelling approaches, the effectiveness, i.e. click-through rate on recommendations, varied between 0.28% and 6.24%. This indicates that mind-map based user modelling is promising, but not trivial, and that further research is required to increase effectiveness.
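To illustrate how mind-map content can feed such a user model, here is a minimal, hypothetical Java sketch that collects the node texts from a Docear/Freeplane mind-map file (.mm files are XML documents whose <node> elements carry their text in a TEXT attribute). The file name is a placeholder, and this is not Docear's actual code.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: collect node texts from a Docear/Freeplane mind-map
// (.mm) file. Such mind-maps are XML documents whose <node> elements carry
// their text in a TEXT attribute. Not Docear's actual code.
public class MindMapTerms {

    static List<String> readNodeTexts(File mindMapFile) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(mindMapFile);
        NodeList nodes = doc.getElementsByTagName("node");
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            String text = ((Element) nodes.item(i)).getAttribute("TEXT");
            if (!text.isEmpty()) {
                texts.add(text);
            }
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // "literature.mm" is a placeholder file name.
        System.out.println(readNodeTexts(new File("literature.mm")));
    }
}
```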

View pre-print

 

Wanted: Participants for a User Study about Docear’s Recommender System

We kindly ask you to participate in a brief study about Docear’s recommender system. Your participation will help us to improve the recommender system, and to secure long-term funding for the development of Docear in general! If you are willing to invest 15 minutes of your time, then please continue reading.

Participate in the Study

  1. Start Docear
  2. Click on the “Show Recommendations” button.
  3. Click on all recommendations, so they open in your web-browser. Click on them even if you know a paper already, or if a paper was recommended previously.
  4. For each recommended paper, please read at least the abstract. You may also read the entire paper if you like, or at least skim through it.
  5. Rate the recommendations: The better the current recommendations are, the more stars you should give. Please note that:
    • Ratings should only reflect the relevance of the current set of recommendations. Do not rate recommendations based on the quality of previously received recommendations.
    • Please do not rate recommendations poorly only because they were shown to you before. When you receive recommendations that were shown previously, give them the same rating as before.
    • Please do not rate recommendations poorly because the link to the PDF was dead and you could not read the PDF. Dead links are not caused by the recommender system, and you should simply ignore them. In other words: please rate only the quality of those recommendations that you could actually read. If none of the recommendations you clicked could be read, just give no rating at all.
  6. The user study runs until July 15th. We would highly appreciate it if you could receive and rate recommendations a couple of times during that period (click the green refresh icon to receive new recommendations). The more recommendations you rate, the better for our research. However, even if you rate recommendations only once, you will help us a lot.

Very important: Please let us know your Docear username so we know you participated in the study – send it to info@docear.org. You can find your username in the bottom-left corner of the status bar (see picture).

In addition, we would very much appreciate if you provide us with the following information:

Age:
Gender:
Nationality:
Status: (e.g. Professor, PhD Student, Master Student, …)
Field of Research:
Approximately how long have you been using Docear:

Docear4Word 1.30: Faster and more robust BibTeX key handling

Docear4Word 1.30 is available for download. We improved the error handling, the speed, and the robustness for special characters in BibTeX keys. Here are all the changes in detail:

  • A database parsing error during Refresh now displays a message with line and column information.
  • More robustness for special characters in BibTeX keys:
    • Changing to the official spec of allowed characters was causing too many problems for existing libraries.
    • The characters < > & % { } " and whitespace are now disallowed, but any other character is valid (see the sketch after this list).
    • An apostrophe (') is allowed in the database but is removed from the BibTeX key before use, rather than being treated as invalid.
  • Some optimizations have been made so that moving through a large document with many fields should be faster.
  • When the Reference tab is selected (and the Docear4Word buttons need to be updated), Docear4Word now makes the minimum number of checks necessary. It is still slower towards the end of a large document than at the beginning, but much improved over previous versions.
  • When the Reference tab is not selected, Docear4Word makes no checks and so cursor movement is at full speed regardless of position in the document.
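For illustration, here is a minimal Java sketch of the key-handling rule described above: keys containing < > & % { } " or whitespace are rejected, while apostrophes are stripped before use. Docear4Word itself is a .NET add-in, so this is only an illustrative sketch, not its actual implementation.

```java
// Illustrative sketch of the BibTeX-key rule described above:
// < > & % { } " and whitespace make a key invalid, apostrophes are
// stripped before use. Not Docear4Word's actual (.NET) implementation.
public class BibTexKeys {

    private static final String DISALLOWED = "<>&%{}\"";

    static boolean isValid(String key) {
        for (char c : key.toCharArray()) {
            if (Character.isWhitespace(c) || DISALLOWED.indexOf(c) >= 0) {
                return false;
            }
        }
        return !key.isEmpty();
    }

    static String cleanForUse(String key) {
        // Apostrophes are allowed in the database but removed before use.
        return key.replace("'", "");
    }

    public static void main(String[] args) {
        System.out.println(isValid("Beel2014"));        // true
        System.out.println(isValid("Beel 2014"));       // false (whitespace)
        System.out.println(cleanForUse("O'Brien2014")); // OBrien2014
    }
}
```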

Docear4LibreOffice / Docear4OpenOffice: Call for Donation (2500$)

One of our users’ most requested features is an add-on for LibreOffice and OpenOffice, similar to Docear4Word, which allows adding formatted references and bibliographies in Microsoft Word based on Docear’s BibTeX files. Unfortunately, we have no skills in developing add-ons for LibreOffice or OpenOffice, which is why we were looking for a freelancer to help us. Now, finally, we have found one. The freelancer is offering to develop a counterpart to Docear4Word that works with LibreOffice and OpenOffice. This means you will be able to select a reference from Docear’s BibTeX database, and the add-on will insert the in-text citation and the bibliography in your Libre/OpenOffice document. As with Docear4Word, you will be able to choose from more than 2,000 citation styles to format your references.

However, the freelancer is not developing the add-on for free. He asks for 2,500 US$ (~1,900 €), which we believe to be a fair price. Therefore, we kindly ask you to donate, so we can pay the freelancer to develop Docear4Libre/OpenOffice. Of course, the add-on will be open-source, reading not only Docear’s BibTeX files but also the BibTeX files of other BibTeX-based reference managers. The freelancer has already developed a simple proof of concept (see screenshot), which uses citeproc-java to add BibTeX-based references. Therefore, we have no doubt that the freelancer will be able to deliver the promised add-on – if we can collect enough money.
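For the technically curious, the sketch below shows roughly how citeproc-java turns BibTeX entries into a formatted bibliography. It follows citeproc-java's documented usage, but the exact class names and method signatures may differ between library versions, and the file name and the "ieee" style are only examples; this is not the freelancer's code.

```java
import java.io.FileInputStream;

import de.undercouch.citeproc.CSL;
import de.undercouch.citeproc.bibtex.BibTeXConverter;
import de.undercouch.citeproc.bibtex.BibTeXItemDataProvider;
import de.undercouch.citeproc.output.Bibliography;
import org.jbibtex.BibTeXDatabase;

// Rough sketch following citeproc-java's documented usage; class names and
// signatures may differ between versions. "references.bib" and the "ieee"
// style are placeholders, not anything from the actual add-on.
public class BibliographySketch {

    public static void main(String[] args) throws Exception {
        BibTeXDatabase db = new BibTeXConverter()
                .loadDatabase(new FileInputStream("references.bib"));

        BibTeXItemDataProvider provider = new BibTeXItemDataProvider();
        provider.addDatabase(db);

        CSL citeproc = new CSL(provider, "ieee");   // any of the ~2,000 CSL styles
        citeproc.setOutputFormat("text");
        provider.registerCitationItems(citeproc);

        Bibliography bibliography = citeproc.makeBibliography();
        for (String entry : bibliography.getEntries()) {
            System.out.println(entry);
        }
    }
}
```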

The freelancer is already working on the add-on, and his goal is to finish it within the next two months or so. However, as long as we cannot pay him, he will not release the add-on, even once he has finished his work (and if he sees that no donations are coming in, he might decide to stop working at any time). Therefore, if you want Docear4Libre/OpenOffice, please donate now! Donate 1$, 5$, 10$, 50$ or 500$ – any contribution matters, and the sooner we have all the money, the sooner you can manage your BibTeX references in LibreOffice and OpenOffice.

Donate via PayPal, or, to save PayPal fees, make a SEPA bank transfer to Docear, IBAN DE18700222000020015578, BIC FDDODEMMXXX. SEPA bank transfers are free of charge within the European Union.
We will keep you posted on the amount of donations, and any important news.
Read more…

Visit Docear at “CeBIT 2014”

On March 10th and March 11th we will be presenting Docear at CeBIT. CeBIT is the digital industry’s biggest and most international event and always worth a visit. We will be at HALL 9, STAND B18. Feel free to visit us and meet the Docear team in person (Stefan will be there, and maybe Joeran).

Docear 1.0.3 Beta: rate recommendations, new web interface, bug fixes, …

Update (February 18, 2014): No bugs were reported, so we declare Docear 1.0.3 stable. It can be downloaded from the normal download page.


With Docear 1.0.3 Beta we have improved PDF handling, provided some help for new users, and enhanced the way you can access your mind maps online.

PDF Handling

We fixed several minor bugs related to PDF handling. In previous versions of Docear, nested PDF bookmarks were imported twice when you dragged and dropped a PDF file onto a mind map. Renaming a PDF file from within Docear changed the file links in your mind maps but not in your BibTeX file. Both issues are fixed now. To rename a PDF file from within Docear, right-click it in Docear’s workspace panel on the left-hand side; note that the mind maps in which the file is linked must be open. We know this is still not ideal, and we will improve it in future versions of Docear.
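To illustrate why a rename has to touch two places, here is a minimal, hypothetical sketch. It assumes that a mind-map node references the PDF via a LINK attribute in the .mm XML and that the BibTeX entry references the same file in a JabRef-style file field; the paths and entries are made up, and this is not Docear's actual implementation.

```java
// Hypothetical sketch of why renaming a PDF must update two places:
// the LINK attribute of mind-map nodes and the "file" field of the
// BibTeX entry. Paths and field layout are illustrative only.
public class RenameExample {

    static String updateMindMapXml(String mapXml, String oldPath, String newPath) {
        // Mind-map nodes reference the PDF via a LINK attribute.
        return mapXml.replace("LINK=\"" + oldPath + "\"", "LINK=\"" + newPath + "\"");
    }

    static String updateBibTexEntry(String entry, String oldPath, String newPath) {
        // The BibTeX entry references the same file in its "file" field.
        return entry.replace(oldPath, newPath);
    }

    public static void main(String[] args) {
        String mapXml = "<node TEXT=\"Survey\" LINK=\"literature/old_name.pdf\"/>";
        String entry = "@article{beel2014, file = {:literature/old_name.pdf:PDF}}";

        String oldPath = "literature/old_name.pdf";
        String newPath = "literature/new_name.pdf";

        System.out.println(updateMindMapXml(mapXml, oldPath, newPath));
        System.out.println(updateBibTexEntry(entry, oldPath, newPath));
    }
}
```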

Rate Your Recommendations

You already know about our recommender system for academic literature. If you want to help us improve it, you can now rate how well a specific set of recommendations reflects your personal field of interest. By the way, it would be nice if you did not rate a set of recommendations negatively only because it contains some recommendations you received previously. Currently, we have no mechanism to detect duplicate recommendations.

Screenshot: rating a set of literature recommendations

Read more…

Docear 2013 in review and our plans for 2014

It is almost a bit late to review 2013, but better late than never. 2013 was doubtless the most active and most successful year for Docear so far. First and foremost, we finally released Docear 1.0, after many betas and release candidates. Of course, Docear 1.0 is far from perfect, but we are really proud of it and think it is an awesome piece of software for managing references, PDFs, and much more. There were many more noteworthy events, some of which we took pictures of:

We presented several research papers at JCDL in Chicago, TPDL on Malta, and RecSys/RepSys in Hong Kong. It is always a pleasure to attend such conferences, not only because they take place in really nice locations, but also because you meet really interesting people (for instance Kris Jack from Mendeley, an enthusiastic and smart guy who develops Mendeley’s recommender system, or Joseph A. Konstan, a true pioneer in the field of recommender systems).


TPDL on Malta


JCDL in Chicago


RecSys in Hong Kong

Almost every year, our mentor Prof. Andreas Nürnberger invites his team members on a sailing trip, and he did so in 2013. For several days we sailed the Baltic Sea, learned a lot about teamwork, and had a lot of fun.


Sailing trip with our mentor Prof. Nürnberger, the Docear team, and some PhD students of his working group

We had the honour of supervising an excellent student team at HTW Berlin, thanks to Prof. Weber-Wulff. The students did a great job developing the Docear Web prototype. It is a pity that the prototype has not yet found its way into our live system, but we have not had the time to give it the last bug fixes and features it needs. However, this is very high on our to-do list.


Four of the five students at HTW Berlin who developed a prototype of “Docear Web”

Docear is primarily located in Magdeburg, Germany, which is close to Berlin. Therefore, we didn’t think twice about attending when Researchgate hosted the 10th “Recommender Stammtisch” (regulars’ table) in Berlin. There, we listened to an enlightening talk by Andreas Lommatzsch and an entertaining introduction by Researchgate’s CEO Ijad Madisch.


Read more…

Comprehensive Comparison of Reference Managers: Mendeley vs. Zotero vs. Docear

Which one is the best reference management software? That’s a question any student or researcher should think about quite carefully, because choosing the best reference manager may save lots of time and increase the quality of your work significantly. So, which reference manager is best? Zotero? Mendeley? Docear? …? The answer is: “It depends”, because different people have different needs. Actually, there is no such thing as the ‘best’ reference manager but only the reference manager that is best for you (even though some developers seem to believe that their tool is the only truly perfect one).

In this blog post, we compare Zotero, Mendeley, and Docear, and we hope that the comparison helps you decide which of the reference managers is best for you. Of course, there are many other reference managers. Hopefully, we can include them in the comparison some day, but for now we only have time to compare these three. We really tried to do a fair comparison, based on a list of criteria that we consider important for reference management software. Of course, the criteria are subjectively selected, as are all criteria by all reviewers, and you might not agree with all of them. However, even if you disagree with our evaluation, you might at least find some new and interesting aspects by which to evaluate reference management tools. You are very welcome to share your constructive criticism in the comments, as well as links to other reviews. In addition, it should be obvious that we – the developers of Docear – are somewhat biased. However, this comparison is most certainly more objective than the ones Mendeley and other reference managers have published ;-).

Please note that we only compared about 50 high-level features and used a simple rating scheme in the summary table. Of course, a more comprehensive list of features and a more sophisticated rating scheme would have been nice, but this would have been too time-consuming. So, consider this review a rough guideline. If you feel that one of the mentioned features is particularly important to you, install the tools yourself, compare the features, and share your insights in the comments! Most importantly, please let us know if something we wrote is not correct. All reviewed reference tools offer lots of functions, and we might have missed one during our review.

Please note that the developers of all three tools constantly improve them and add new features. Therefore, the table might not be perfectly up-to-date. In addition, it is difficult to rate a particular functionality with only one of three possible ratings (yes; no; partly). Therefore, we highly suggest reading the detailed review, which explains the rationale behind the ratings.

The table above provides an overview of how Zotero, Mendeley, and Docear support you in various tasks, how open and free they are, etc. Details on the features and ratings are provided in the following sections. As already mentioned, if you notice a mistake in the evaluation (e.g. we missed a key feature), please let us know in the comments.

Overview

If you don’t want to read a lot, just jump to the summary.

We believe that a reference manager should offer more features than simple reference management. It should support you in (1) finding literature, (2) organizing and annotating literature, (3) drafting your papers, theses, books, assignments, etc., (4) managing your references (of course), and (5) writing your papers, theses, etc. Additionally, many – but not all – students and researchers might be interested in (6) socializing and collaboration, (7) note, task, and general information management, and (8) file management. Finally, we think it is important that a reference manager (9) is available for the major operating systems, (10) has an information management approach you like (tables, social tags, search, …), and (11) is open, free, and sustainable (see also What makes a bad reference manager).

Read more…

Docear4Word 1.23 Released

The new Docear4Word v1.23 is out as a beta version. The changes are:

  1. A more detailed error message when there is a parsing error in your BibTeX file.
  2. The latest v1.0.517 version of CiteProc-JS has been included. This should finally solve all the sorting and numbering issues.
  3. We made some adjustments that could improve the performance of Docear4Word, though we are not sure whether they really will.
  4. Special characters such as . ! ? _ ^ < > are now allowed at the beginning of a BibTeX name.

Please note that this is probably the last version that is compiled with VS2010 (requiring you to install .NET 2.0). The next release will be compiled with VS2013 (.NET 4), which should solve some compatibility issues with Windows 8.

We have not yet thoroughly tested the new version. So, if you want to be sure to get a stable version, wait a few days (if you don’t see any updates here in the next few days, users didn’t report any bugs and the current version is stable).

Download Docear4Word 1.23 Beta