Research Activities

Docear originated from the desire to understand how mind maps are created and how researchers manage their literature. Today we conduct research in many areas, always with one goal: improving the user experience of Docear. Read on to learn which areas we are researching and why.

Literature Management

Docear is an academic literature suite, so we are naturally interested in how people use Docear and how they manage literature in general. So far we have done little research in this field (and published only two demo papers about Docear and its predecessor SciPlore MindMapping). If you are interested in, for example, conducting a case study about Docear, please contact us; we would be happy to assist you.

Joeran Beel, Bela Gipp, Stefan Langer, and Marcel Genzmehr. Docear: An Academic Literature Suite for Searching, Organizing and Creating Academic Literature. In Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries, JCDL ’11, pages 465–466, New York, NY, USA, 2011. ACM.

Beel, Joeran, Stefan Langer, and Marcel Genzmehr. “Docear4Word: Reference Management for Microsoft Word based on BibTeX and the Citation Style Language (CSL).” In Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL’13), 445–446. ACM, 2013.

Joeran Beel, Bela Gipp, and Christoph Mueller. ’SciPlore MindMapping’ – A Tool for Creating Mind Maps Combined with PDF and Reference Management. D-Lib Magazine, 15 (11), November 2009. Brief Online Article.

User Modeling & Recommender Systems

Our primary research goal is to find out how we can create user models based on our users’ data. This will allow us to provide recommendations to our users for literature, conferences, funding, and whatever else is relevant to scientists. Read our papers to learn more, and contact us if you are interested in a cooperation.

Langer, Stefan, Joeran Beel. “The Comparability of Recommender System Evaluations and Characteristics of Docear’s Users.” In Proceedings of the Workshop on Recommender Systems Evaluation: Dimensions and Design (REDD), at the ACM Recommender Systems Conference (RecSys). 2014.

Beel, Joeran, Stefan Langer, Bela Gipp, Andreas Nürnberger. “The Architecture and Datasets of Docear’s Research Paper Recommender System.” In Proceedings of the 3rd International Workshop on Mining Scientific Publications (WOSP 2014) at the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2014). ACM, 2014.

Beel, Joeran, Stefan Langer, Marcel Genzmehr, Bela Gipp. “Utilizing Mind-Maps for Information Retrieval and User Modelling.” In Proceedings of the 22nd Conference on User Modelling, Adaptation, and Personalization (UMAP). Springer, 2014 (to appear in July).

Beel, Joeran, Stefan Langer, Marcel Genzmehr, Bela Gipp, Corinna Breitinger, and Andreas Nürnberger. “Research Paper Recommender System Evaluation: A Quantitative Literature Survey.” In Proceedings of the Workshop on Reproducibility and Replication in Recommender Systems Evaluation (RepSys) at the ACM Recommender System Conference (RecSys). ACM International Conference Proceedings Series (ICPS), 2013.

Beel, Joeran, Stefan Langer, Marcel Genzmehr, Bela Gipp, and Andreas Nürnberger. “A Comparative Analysis of Offline and Online Evaluations and Discussion of Research Paper Recommender System Evaluation.” In Proceedings of the Workshop on Reproducibility and Replication in Recommender Systems Evaluation (RepSys) at the ACM Recommender System Conference (RecSys). ACM International Conference Proceedings Series (ICPS), 2013.

Beel, Joeran, Stefan Langer, Marcel Genzmehr, and Andreas Nürnberger. “Introducing Docear’s Research Paper Recommender System.” In Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL’13), 459–460. ACM, 2013.

Beel, Joeran, Stefan Langer, Marcel Genzmehr, and Andreas Nürnberger. “Persistence in Recommender Systems: Giving the Same Recommendations to the Same Users Multiple Times.” In Proceedings of the 17th International Conference on Theory and Practice of Digital Libraries (TPDL 2013), edited by Trond Aalberg, Milena Dobreva, Christos Papatheodorou, Giannis Tsakonas, and Charles Farrugia, 8092:390–394. Lecture Notes in Computer Science (LNCS). Valletta, Malta: Springer, 2013.

Beel, Joeran, Stefan Langer, Andreas Nürnberger, and Marcel Genzmehr. “The Impact of Demographics (Age and Gender) and Other User Characteristics on Evaluating Recommender Systems.” In Proceedings of the 17th International Conference on Theory and Practice of Digital Libraries (TPDL 2013), edited by Trond Aalberg, Milena Dobreva, Christos Papatheodorou, Giannis Tsakonas, and Charles Farrugia, 400–404. Valletta, Malta: Springer, 2013.

Beel, Joeran, Stefan Langer, and Marcel Genzmehr. “Sponsored Vs. Organic (Research Paper) Recommendations and the Impact of Labeling.” In Proceedings of the 17th International Conference on Theory and Practice of Digital Libraries (TPDL 2013), edited by Trond Aalberg, Milena Dobreva, Christos Papatheodorou, Giannis Tsakonas, and Charles Farrugia, 395–399. Valletta, Malta, 2013.

Joeran Beel and Stefan Langer. An Exploratory Analysis of Mind Maps. In Proceedings of the 11th ACM Symposium on Document Engineering (DocEng’11), Mountain View, California, USA, pages 81–84, 2011. ACM.

PDF Metadata Extraction

Typing bibliographic data or creating reference lists is probably the most annoying task for a researcher. Docear already offers some tools to facilitate this process. For instance, if you drag and drop a PDF into Docear, you will be asked whether metadata such as title, authors, and journal should be extracted from the PDF to create a reference. However, our metadata extraction (like that of other reference managers) could be improved, and that is what we are working on.
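To give a rough idea of what a style-based title heuristic can look like, here is a minimal sketch in Python. It is not Docear's actual implementation; it only illustrates the general idea of using font size as a hint for the title. The function name `guess_title` and the input format (a list of text lines from the first page, each paired with its font size) are assumptions made for this example.

```python
from typing import List, Tuple


def guess_title(lines: List[Tuple[str, float]], tolerance: float = 0.1) -> str:
    """Guess a title from first-page lines given as (text, font_size) pairs.

    Hypothetical helper for illustration only: it assumes the title is
    the text set in the largest font on the first page.
    """
    if not lines:
        return ""
    largest = max(size for _, size in lines)
    # Keep every non-empty line whose font size matches the largest size,
    # in the order given, so that multi-line titles are joined.
    parts = [
        text.strip()
        for text, size in lines
        if abs(size - largest) <= tolerance and text.strip()
    ]
    return " ".join(parts)


first_page = [
    ("Journal of Examples, Vol. 1", 9.0),
    ("A Study of", 18.0),
    ("Mind Maps", 18.0),
    ("Jane Doe", 11.0),
]
print(guess_title(first_page))
```

A heuristic like this fails when, for example, a journal name or a drop cap is set in a larger font than the title, which is one reason metadata extraction leaves room for improvement.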

Read our paper to learn more, and contact us if you are interested in a cooperation.

Beel, Joeran, Stefan Langer, Marcel Genzmehr, and Christoph Müller. “Docear’s PDF Inspector: Title Extraction from PDF files.” In Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL’13), 443–444. ACM, 2013.

Lipinski, Mario, Kevin Yao, Joeran Beel, and Bela Gipp. “Evaluation of Header Metadata Extraction Approaches and Tools for Scientific PDF Documents.” In Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries (JCDL’13), 385–386, 2013.

Joeran Beel, Bela Gipp, Ammar Shaker, and Nick Friedrich. SciPlore Xtract: Extracting Titles from Scientific PDF Documents by Analyzing Style Information (Font Size). In M. Lalmas, J. Jose, A. Rauber, F. Sebastiani, and I. Frommholz, editors, Research and Advanced Technology for Digital Libraries, Proceedings of the 14th European Conference on Digital Libraries (ECDL’10), volume 6273 of Lecture Notes in Computer Science (LNCS), pages 413–416, Glasgow (UK), September 2010. Springer.

Academic Search

We conducted several studies on Google Scholar’s ranking algorithm and its vulnerability to spam, as well as on what we called “academic search engine optimization”. In addition, we analysed how data from mind maps could improve academic search engines, and we developed the “Machine-readable Digital Library” Mr. DLib, which we hope to integrate directly into Docear soon.

Read our papers to learn more, and contact us if you are interested in a cooperation.

Joeran Beel, Bela Gipp, Stefan Langer, Marcel Genzmehr, Erik Wilde, Andreas Nürnberger, and Jim Pitman. Introducing Mr. DLib, a Machine-readable Digital Library. In Proceedings of the 11th ACM/IEEE Joint Conference on Digital Libraries (JCDL‘11), 2011. ACM.

Joeran Beel, Bela Gipp, and Erik Wilde. Academic Search Engine Optimization (ASEO): Optimizing Scholarly Literature for Google Scholar and Co. Journal of Scholarly Publishing, 41 (2): 176–190, January 2010. University of Toronto Press.

Joeran Beel and Bela Gipp. Academic search engine spam and Google Scholar’s resilience against it. Journal of Electronic Publishing, 13 (3), December 2010.

Joeran Beel and Bela Gipp. On the Robustness of Google Scholar Against Spam. In Proceedings of the 21st ACM Conference on Hypertext and Hypermedia (HT’10), pages 297–298, Toronto (CA), June 2010. ACM.

Joeran Beel and Bela Gipp. Google Scholar’s Ranking Algorithm: The Impact of Citation Counts (An Empirical Study). In Andre Flory and Martine Collard, editors, Proceedings of the 3rd IEEE International Conference on Research Challenges in Information Science (RCIS’09), pages 439–446, Fez (Morocco), April 2009. IEEE.

Joeran Beel and Bela Gipp. Google Scholar’s Ranking Algorithm: An Introductory Overview. In Birger Larsen and Jacqueline Leta, editors, Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI’09), volume 1, pages 230–241, Rio de Janeiro (Brazil), July 2009. International Society for Scientometrics and Informetrics. ISSN 2175-1935.

Joeran Beel and Bela Gipp. Google Scholar’s Ranking Algorithm: The Impact of Articles’ Age (An Empirical Study). In Shahram Latifi, editor, Proceedings of the 6th International Conference on Information Technology: New Generations (ITNG’09), pages 160–164, Las Vegas (USA), April 2009. IEEE.

Document Similarity & Plagiarism Detection

In close cooperation with our partner SciPlore, we are researching how similar documents can be identified. The focus lies on similarity calculations based on citations. SciPlore also uses this approach to detect plagiarism.
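As a simple illustration of a citation-based similarity measure, the following Python sketch compares two documents by the overlap of their reference lists (often called bibliographic coupling), normalised as a Jaccard coefficient. This is a generic textbook measure chosen for the example, not the specific approach used by SciPlore; the function name `coupling_similarity` and the document identifiers are hypothetical.

```python
from typing import Set


def coupling_similarity(refs_a: Set[str], refs_b: Set[str]) -> float:
    """Jaccard overlap of two reference lists.

    Illustrative assumption: each document is represented only by the
    set of identifiers of the works it cites.
    """
    if not refs_a and not refs_b:
        return 0.0  # no citations at all, so no evidence of similarity
    shared = refs_a & refs_b
    combined = refs_a | refs_b
    return len(shared) / len(combined)


doc_a = {"p1", "p2", "p3"}
doc_b = {"p2", "p3", "p4"}
print(coupling_similarity(doc_a, doc_b))  # 2 shared of 4 distinct references
```

Because such a measure relies only on which works are cited and not on the wording, it can in principle relate documents even when their text differs, for example after translation or heavy paraphrasing.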

Read our papers to learn more, and contact us if you are interested in a cooperation.

Bela Gipp, Norman Meuschke, and Joeran Beel. Comparative Evaluation of Text- and Citation-based Plagiarism Detection Approaches using GuttenPlag. In Proceedings of the 11th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL’11), pages 255–258, Ottawa, Canada, June 2011. ACM.

Bela Gipp and Joeran Beel. Citation Based Plagiarism Detection – A New Approach to Identify Plagiarized Work Language Independently. In Proceedings of the 21st ACM Conference on Hypertext and Hypermedia (HT’10), pages 273–274, New York, NY, USA, June 2010. ACM.

Bela Gipp, Adriana Taylor, and Joeran Beel. Link Proximity Analysis – Clustering Websites by Examining Link Proximity. In M. Lalmas, J. Jose, A. Rauber, F. Sebastiani, and I. Frommholz, editors, Proceedings of the 14th European Conference on Digital Libraries (ECDL’10): Research and Advanced Technology for Digital Libraries, volume 6273 of Lecture Notes in Computer Science (LNCS), pages 449–452, September 2010. Springer.

Bela Gipp and Joeran Beel. Citation Proximity Analysis (CPA) – A new approach for identifying related work based on Co-Citation Analysis. In Birger Larsen and Jacqueline Leta, editors, Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI’09), volume 2, pages 571–575, Rio de Janeiro (Brazil), July 2009. International Society for Scientometrics and Informetrics.