Update of the Google Scholar PDF Metadata Retrieval Library

In the past few weeks, several users reported that Docear could no longer retrieve metadata from Google Scholar. It took us a while, but we have now found and, hopefully, fixed the problem. We need your help to test whether the fixed version really works for all of you (this is very hard for us to test because Google Scholar often did not block us while it did block some users).

Here is how you can install the new Google Scholar PDF Metadata Retrieval Library:

  1. Close Docear if it’s currently running
  2. Download the Google Scholar library.
  3. Replace the existing library with the new one. You will find your old library in a path like C:\Program Files (x86)\Docear\plugins\org.docear.plugin.bibtex\lib\
  4. Start Docear
  5. Request metadata 🙂
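
Steps 2 and 3 can also be scripted. The following is only a minimal sketch, assuming the new jar has already been downloaded and Docear is closed. `replace_library` is a hypothetical helper and not part of Docear. It keeps a copy of the old jar so the change can be undone. On Windows, writing below C:\Program Files (x86) may require administrator rights.

```python
import shutil
from pathlib import Path


def replace_library(new_jar: Path, lib_dir: Path) -> Path:
    """Copy new_jar over the same-named file in lib_dir.

    Hypothetical helper for illustration only. If a file with the same
    name already exists, it is first copied to '<name>.bak' next to it.
    """
    target = lib_dir / new_jar.name
    if target.exists():
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)  # keep the old library as a backup
    shutil.copy2(new_jar, target)     # overwrite with the new library
    return target


# Example call (both paths are assumptions; adjust them to your system):
# replace_library(
#     Path.home() / "Downloads" / "docear-metadata-lib-0.0.1.jar",
#     Path(r"C:\Program Files (x86)\Docear\plugins\org.docear.plugin.bibtex\lib"),
# )
```

The function only touches the software library file. It does not read or modify the BibTeX file that holds your references.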

Please let us know if the new library works, in particular if you had problems with the old one. Please note that even with the new library you might have to enter captchas occasionally. However, with the old library, captchas kept occurring again and again for some users. This problem should be fixed (after entering the captcha correctly, you should be able to retrieve metadata for at least a few dozen PDFs).

31 comments to Update of the Google Scholar PDF Metadata Retrieval Library

  • Katrina

    I just downloaded Docear, and after creating a set of references (without any problems) it stopped working after about 30 documents. Captcha requests keep appearing, and after entering the correct word the next one comes right up. It does not manage to find anything on Google Scholar anymore, the captcha requests just keep going on…

    I’d be grateful for some help with this issue.

    Best regards

    Katrina

  • Katrina

    …one additional question: I followed the instructions given above about how to solve this problem. I was wondering though, how to actually replace the library. If I just delete the old one, I assume all my stored data will be gone? If I put the new library in the old place, do I need to adjust the name or something else in order to make it work?

    Thanks, best

    Katrina

    • Hi Katrina,

      there is a difference between your “reference library”, i.e. the BibTeX file in which all your references are stored, and the “software library” that is responsible for fetching metadata from Google Scholar. In this post I am referring to the software library. So, if you follow the instructions exactly, none of your existing references will be deleted. Just download http://downloads.docear.org/docear-metadata-lib-0.0.1.jar and save the file docear-metadata-lib-0.0.1.jar in e.g. C:\Program Files (x86)\Docear\plugins\org.docear.plugin.bibtex\lib\ (there will already be a file called “docear-metadata-lib-0.0.1.jar” <- overwrite it with the new file).

  • Chris

    Joeran,

    I just saw this update, started using it, and it’s worked very well for me. Very happy indeed.

  • Leon

    I also implemented the .jar switch.

    When I restarted Docear, I only needed to enter one captcha and the meta-retrieval was up and running.

    Thanks!

  • Edgar

    Hello. I changed the library but it still doesn’t work. I tried several papers and none of them retrieved anything. Any other ideas about what I could do to successfully retrieve metadata? Thanks.

  • sw

    It worked! Wonderful. I was about to delete Docear, although I loved it for its mind map function. I was really frustrated at not being able to automatically read in my loads of PDFs. Now everything is fine. Thank you. However, it was not too easy to find this page with the solution. I tried the five different methods you proposed in another entry. Maybe delete the other one or add a link?

  • Henrik

    Hello,
    I changed the library too, but there’s no change. When I search for a title, Docear starts searching but stops immediately and there are no results. A manual search in Google Scholar always shows the right paper.

    I’m running Docear 1.1.1 build239 on a Windows 7 system.

    Thanks

  • Claudio

    Hi,

    Just starting to use Docear … Metadata extraction is not working at all. I’ve tried all solutions. I was never asked for a captcha.

    Any other possibilities? Updates?

    Thanks

  • Federico

    I have just started a new project, some months after my last use of Docear and I agree with Claudio: metadata extraction is not working at all. Any workarounds for it?

  • Vlad

    Hello,

    Doesn’t work at all

    I’ve tried updating Java, downloading the new Google Scholar PDF Metadata Retrieval Library, and clearing all settings.
    I never get asked for captchas; it just says ‘Fetched 0 entries’.

  • Brett

    Similar to Vlad, I downloaded the new file and still have the problem. I have never been asked to enter a captcha. All I see is “Fetched 0 entries”.

    I should clarify, with the new library the screen doesn’t even pop up. So I don’t see anything with the newer library.

  • Nicholas

    I have the same problem as described by Vlad

  • einhander

    It works, thanks.

  • johannes

    Same problems as Vlad.

    In my log file I find the following message. Hope it helps.

    STDOUT: scholar.google.comhttpMay 05, 2015 9:44:15 AM org.docear.metadata.extractors.GoogleScholarExtractor search

    INFO: scholar.google.comhttp
    java.net.UnknownHostException: scholar.google.comhttp
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at sun.net.NetworkClient.doConnect(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.http.HttpClient.<init>(Unknown Source)
    at sun.net.www.http.HttpClient.New(Unknown Source)
    at sun.net.www.http.HttpClient.New(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.AccessController.doPrivileged(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:439)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:424)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:178)
    at org.docear.metadata.extractors.GoogleScholarExtractor.search(GoogleScholarExtractor.java:76)
    at org.docear.metadata.extractors.GoogleScholarExtractor.call(GoogleScholarExtractor.java:243)
    at org.docear.metadata.extractors.GoogleScholarExtractor.call(GoogleScholarExtractor.java:1)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

  • Steve

    Hi. Reporting success.
    Google Scholar just recently started blocking automatic metadata retrieval. A few captchas were presented but to no effect.
    I followed your procedure and now it works again.
    Thank you very much.

  • Boyd Blackwell

    Same error. Is this a corrupt address? scholar.google.comhttp

    The new library from the October 13th, 2014 post worked once, with no captchas, then stopped, always with the same error.

    STDOUT: Framework launched
    STDOUT: REMINDERHOOK: org.freeplane.features.mode.mindmapmode.MModeController@1933db2Jun 22, 2015 4:15:57 PM org.freeplane.core.util.LogUtils info
    INFO: requesting mode: MindMap
    Jun 22, 2015 4:16:05 PM org.freeplane.core.util.LogUtils info
    INFO: menu items to execute: []

    STDOUT: scholar.google.comhttpJun 22, 2015 4:16:36 PM org.docear.metadata.extractors.GoogleScholarExtractor search
    INFO: scholar.google.comhttp
    java.net.UnknownHostException: scholar.google.comhttp
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
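
On the question of whether `scholar.google.comhttp` is a corrupt address: a `java.net.UnknownHostException` means the name could not be resolved, and the name in these logs is `scholar.google.com` with the string `http` fused onto the end. Entering captchas cannot help in that situation, because the request never reaches Google Scholar. Why the library builds this host name cannot be determined from the logs alone. The sketch below only scans a pasted log for such entries. `unknown_hosts` and `looks_concatenated` are hypothetical helper names, and the expected host is an assumption taken from the log lines above.

```python
import re

# Assumption: the host the library is supposed to contact.
EXPECTED_HOST = "scholar.google.com"

# Matches lines such as
# "java.net.UnknownHostException: scholar.google.comhttp"
UNKNOWN_HOST = re.compile(r"java\.net\.UnknownHostException: (\S+)")


def unknown_hosts(log_text: str) -> list[str]:
    """Return every host name reported as unresolvable in the log."""
    return UNKNOWN_HOST.findall(log_text)


def looks_concatenated(host: str, expected: str = EXPECTED_HOST) -> bool:
    """True if host is the expected name with extra text glued on."""
    return host != expected and host.startswith(expected)


# Example: report suspicious host names found in a log file.
# for host in unknown_hosts(open("log.txt", encoding="utf-8").read()):
#     if looks_concatenated(host):
#         print("malformed host name:", host)
```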

  • Edwin

    Did not work.

    Output before changing docear-metadata-lib-0.0.1.jar

    11-Aug-2015 06:57:38 org.docear.metadata.extractors.GoogleScholarExtractor search
    INFO: scholar.google.comhttp
    java.net.UnknownHostException: scholar.google.comhttp
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:175)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385)
    at java.net.Socket.connect(Socket.java:546)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:173)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:427)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:213)
    at sun.net.www.http.HttpClient.New(HttpClient.java:306)
    at sun.net.www.http.HttpClient.New(HttpClient.java:325)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:955)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:891)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:809)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:439)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:424)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:178)
    at org.docear.metadata.extractors.GoogleScholarExtractor.search(GoogleScholarExtractor.java:74)
    at org.docear.metadata.extractors.GoogleScholarExtractor.call(GoogleScholarExtractor.java:216)
    at org.docear.metadata.extractors.GoogleScholarExtractor.call(GoogleScholarExtractor.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)

    Output after changing docear-metadata-lib-0.0.1.jar

    STDOUT: scholar.google.comhttp11-Aug-2015 07:19:36 org.docear.metadata.extractors.GoogleScholarExtractor search
    INFO: scholar.google.comhttp
    java.net.UnknownHostException: scholar.google.comhttp
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:175)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385)
    at java.net.Socket.connect(Socket.java:546)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:173)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:427)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:213)
    at sun.net.www.http.HttpClient.New(HttpClient.java:306)
    at sun.net.www.http.HttpClient.New(HttpClient.java:325)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:955)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:891)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:809)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:439)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:424)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:178)
    at org.docear.metadata.extractors.GoogleScholarExtractor.search(GoogleScholarExtractor.java:76)
    at org.docear.metadata.extractors.GoogleScholarExtractor.call(GoogleScholarExtractor.java:243)
    at org.docear.metadata.extractors.GoogleScholarExtractor.call(GoogleScholarExtractor.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)

    I only have this problem at university, not at home. I wonder if this has to do with the fact that Google Scholar tells me which papers my university subscribes to. Maybe this is to keep people from automatically fetching papers.

  • Drew Thayer

    I just tried this (replacing the downloaded file in the recommended folder path) and it works. Thanks.

  • Dan

    The file is not available. Is that normal? (Working on v1.2.)

  • Hidayet Tutun

    Hi,
    I am using the new version of Docear, but metadata extraction is not working. You said it was included in Docear 1.2, but it did not work. What should we do to get metadata extraction working again?

  • Urvashi

    Downloaded the latest version. It worked fine for a while, but then started having the same issues as the previous version for retrieving articles. What to do?

  • Angus

    I’m having the same problem. I used Docear just long enough to fall in love with this feature and now it has stopped retrieving articles. I’d love a suggestion as to how to move forward.

  • Sanket Gupte

    I am having the same problem now. I am so much addicted to retrieving the references easily, that the alternative is really cumbersome.

  • janewaite

    I am having the same problem. Please can someone help?

  • Nicole

    I cannot retrieve metadata, it keeps searching and never finishes. Please help solve this problem with metadata retrieval. It worked before but has now stopped working.

    • I am sorry, we don’t have the capacity to make any changes to the metadata retrieval. If it doesn’t work (any more), I am afraid you need to import the data manually, e.g. from Google Scholar (you can search on Google Scholar for the paper, and then copy and paste the BibTeX code from Google Scholar to Docear).
