286 Commits

Author SHA1 Message Date
Sean Takats
b70cb4a2e6 Fixes JSTOR single item save; copies abbreviated publication title to full title if full title is absent in RIS (reported problem with Nature); fixes Voyager to handle UPenn catalog 2007-02-27 19:40:28 +00:00
Dan Stillman
fe0c574dee Updated JSTOR regex from Sean to work on articles from JSTOR's sandbox -- pushed to repo 2007-02-21 18:58:35 +00:00
Sean Takats
2e1fa819ab Closes #516 for PubMed direct hits and refines Max Planck VL Library support 2007-02-15 22:42:36 +00:00
Sean Takats
febd827a90 New translator: Max Planck Virtual Laboratory Library 2007-02-07 02:05:38 +00:00
Dan Stillman
bbb1236ae8 EBSCOhost and ScienceDirect to repo 2007-01-27 16:06:57 +00:00
Simon Kornblith
c0350f1c14 references #407, error in EBSCOhost translator. while the issue has been fixed on our end, at the moment, EBSCO's export appears to be broken, returning an empty file instead of RIS output. there's a relatively easy workaround if they don't get it working in a day or two.
closes #517, ScienceDirect translator fails
2007-01-27 10:22:17 +00:00
Dan Stillman
7f11c3fe61 Pushed ProQuest, ScienceDirect, SpringerLink, Nature to repo 2007-01-27 07:57:43 +00:00
Simon Kornblith
fe9e699e5e closes #488, Proquest translator broken
references #502, Special handling for automatic tags (support is now enabled, but tag type is not maintained during RDF export)
references #517, ScienceDirect translator fails (I fixed the issue I had translating this page, but I think the reported error may be different)
2007-01-27 06:24:31 +00:00
Simon Kornblith
7a4b87257c closes #505, Bibliography alpha sorting by case
closes #376, Bibliography export order jumbled
closes #482, Tag selector does not refresh on import/delete
closes #499, zotero RDF import of attachments has a flaw
closes #500, Improve COinS handling of other item types

- fixes an issue with importing directory hierarchy
- fixes an issue where the SpringerLink translator could fail to recognize a scrapable resource
- fixes an issue where the Nature translator could fail to retrieve an associated PDF

feel free to push the updates to the SpringerLink and Nature translators to the repository; theoretically, the RDF translator should be backwards-compatible too, but I'd like to test it with b3 before potentially breaking functionality.
2007-01-27 05:00:13 +00:00
Sean Takats
8a3bca8307 Addresses #501. Had a problem with an accented character in a regular expression 2007-01-24 01:32:25 +00:00
Dan Stillman
9042365fd8 Closes #501
Sudoc translator to repo
2007-01-23 23:28:35 +00:00
Sean Takats
12a30c8e2f Addresses #501 2007-01-23 23:24:51 +00:00
Dan Stillman
61dba96bf8 RIS update from Sean (pushed to repo) (ignore some extra whitespace differences) 2007-01-20 00:21:58 +00:00
Dan Stillman
1112fcf1ff Google Scholar fix from Sean (posted to repo) 2007-01-18 22:58:36 +00:00
Dan Stillman
cfb297dfeb Updated arXiv.org translator pushed to repo 2007-01-12 20:01:34 +00:00
Sean Takats
403d7602e7 New arXiv.org translator retrieves metadata from OAI-PMH service 2007-01-12 19:17:15 +00:00
Dan Stillman
a1b351ed20 Pushed fixed RIS translator to repo
Refs #495
2007-01-12 00:34:06 +00:00
Simon Kornblith
6f6cec3300 closes #495, RIS export failing (feel free to push this out to the repository)
removes unnecessary debug code from browser.js
2007-01-12 00:19:22 +00:00
Dan Stillman
dd4ab35dc4 Pushed fixed ISI Web of Knowledge translator to repo 2007-01-10 05:03:14 +00:00
Simon Kornblith
c37aea8165 oops; update timestamps 2007-01-10 00:19:07 +00:00
Simon Kornblith
9d39f73947 - fixes issues with the ISI Web of Knowledge translator. in the process of testing, I realized that, when searching the Web of Knowledge for common words (e.g., "quark"), the Web of Knowledge does not return a meaningful set of results. neither the "Web of Science" links, nor the export feature (through Zotero or the web interface) work at all. perhaps this is something to contact ISI about?
- fixes miscellaneous issues with frames (not relevant to b3)
2007-01-10 00:17:52 +00:00
Dan Stillman
62f273e034 Merge r1047 through r1063 back to branch 2007-01-08 09:33:36 +00:00
Simon Kornblith
da1f4944f5 ...and another tweak 2006-12-21 23:46:27 +00:00
Simon Kornblith
4881a2bd7c tweak HighWire regex 2006-12-21 23:33:31 +00:00
Simon Kornblith
259700f3d7 closes #468, make RIS disregard headers 2006-12-21 22:27:21 +00:00
Simon Kornblith
42a3ac9cb1 fix missing comma in book section 2006-12-21 22:13:05 +00:00
Simon Kornblith
d2e4063b20 fixes #466, Miscellaneous Chicago Manual of Style formatting problems 2006-12-21 22:09:45 +00:00
Simon Kornblith
b650d99ef3 fix missing "l" (r1000) 2006-12-20 12:11:33 +00:00
Simon Kornblith
ae41107c59 support HighWire metasearch when accessed through HighWire site 2006-12-20 11:59:23 +00:00
Simon Kornblith
060512a379 handle all HighWire journals (Oxford, Science, etc.) with one translator 2006-12-20 11:45:00 +00:00
Simon Kornblith
a33fbd0834 add ACS Publications translator 2006-12-20 11:08:34 +00:00
Simon Kornblith
9ae9045664 - add Chicago Manual of Style (Note with Bibliography)
- only use numbers when outputting note citations
- fix type conditionals
2006-12-20 09:26:25 +00:00
Simon Kornblith
5ddd30eed8 - closes #458, chicago manual of style bibliography style. Sean and others, please test this if you have the time.
- fixes conditional bugs in cite.js. i haven't had to use the conditionals until now, but Chicago has some bizarre formatting rules (especially 17.169)
- removes extraneous debug code
2006-12-20 05:01:13 +00:00
Simon Kornblith
8515b0551a add AMS MathSciNet 2006-12-20 02:29:39 +00:00
Simon Kornblith
4afeb64196 support Oxford Journals table of contents 2006-12-20 01:37:01 +00:00
Simon Kornblith
cfb9397638 add Oxford Journals. PDF downloads don't work quite right, because of #460. 2006-12-20 00:03:16 +00:00
Simon Kornblith
39025ab461 support Gale Literature Resource Center MLA Bibliography 2006-12-18 20:28:06 +00:00
Simon Kornblith
4e4c46f5c4 adapt InfoTrac College Edition translator to work with MLA Bibliography (version 1). there's also another MLA bibliography under the GaleNet Literature Resource Center that we don't yet support 2006-12-18 11:04:59 +00:00
Simon Kornblith
511f7ec77d - add DB load and save features to Scaffold
- add ECL notice to XUL files that were missing it
2006-12-18 08:01:04 +00:00
Simon Kornblith
0c06cecebd - add Web of Science (but not Web of Knowledge CrossSearch) translator
- allow forcing of content type in doPost, although this luckily was not yet needed
- add missing Scaffold copy icon
2006-12-18 03:11:35 +00:00
Simon Kornblith
801a6c5dd2 - don't try to scrape JSTOR article information pages
- get rid of some console errors
2006-12-17 12:35:17 +00:00
Simon Kornblith
46607d0f90 support table of contents in JSTOR 2006-12-17 12:25:37 +00:00
Simon Kornblith
f65a62e0da added ACM 2006-12-17 11:53:29 +00:00
Simon Kornblith
b482e75bea fixed InnoPAC translator on University of Washington catalog 2006-12-17 09:54:37 +00:00
Simon Kornblith
db5ba3f2f0 - added Cambridge Scientific Abstracts
- fixed date generation in Scaffold
2006-12-17 09:44:59 +00:00
Simon Kornblith
448faedab5 - added a "copy" feature to Scaffold, which copies a translator to the clipboard
- implemented ability to test regex and run detectCode from within Scaffold. it is now possible to generate an entire translator from within the environment.
- added Factiva translator, which should work, although Factiva just went down for maintenance a few minutes ago
2006-12-17 01:27:42 +00:00
Simon Kornblith
cd557c2537 add IEEE Xplore translator 2006-12-16 21:32:47 +00:00
Simon Kornblith
e1489dfb06 add nature translator 2006-12-16 09:16:48 +00:00
Simon Kornblith
df91887d52 oops 2006-12-16 06:08:08 +00:00
Simon Kornblith
cf7e7e3ca6 - adds support for SpringerLink
- fixes a bug in line reading interface when called from another scraper
2006-12-16 06:05:51 +00:00
Simon Kornblith
a5ff752509 -add blackwell synergy translator
-add DOI support to RIS translator
2006-12-16 04:41:08 +00:00
Simon Kornblith
62255639ee - added Ovid translator
- fixed bug with scraping multiple items from within scaffold
2006-12-16 03:29:55 +00:00
Simon Kornblith
17d7b9fe88 - support browse mode in ScienceDirect 2006-12-15 23:52:33 +00:00
Simon Kornblith
baa21e1a6f - added ScienceDirect translator
- fixed a bug in scaffold involving debug logging
2006-12-15 22:46:27 +00:00
Simon Kornblith
875ceea852 closes #449, use library domain in repository field 2006-12-15 20:25:25 +00:00
Simon Kornblith
9192478dd8 addresses #427, add abstract support to translators
adds abstract support to JSTOR and ProQuest translators
adds abstract support to RIS translator
uses dcterms "abstract" property in Zotero RDF for abstract export
2006-12-15 19:47:21 +00:00
Simon Kornblith
3982e1aabf - changed Zotero.Utilities.debug() to Zotero.debug(), for consistency and for the upcoming in-browser translator development tool
- various other preparations
2006-12-15 08:54:31 +00:00
Simon Kornblith
91492910ad addresses #427, Add abstract support to translators.
-translators may now attach abstracts simply by assigning a value to item.abstractNote
-added abstract support to PubMed translator
2006-12-14 22:58:00 +00:00
Simon Kornblith
31535b6d1d oops, need to commit this too. 2006-12-13 05:57:02 +00:00
Simon Kornblith
c6d4cdd57b - closes #391, Second export to same location with attached files fails
- removes extraneous debug code from Zotero RDF export
2006-12-13 05:23:39 +00:00
Simon Kornblith
857f0a907c closes #369, scrapers should store Repository field. the label is automatically used as the repository field, unless a translator explicitly sets the item's repository property to a value. if a translator sets the item's repository property to "false," no value is stored. 2006-12-13 05:05:03 +00:00
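The fallback rule described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: `resolveRepository` and its parameters are hypothetical names, and it assumes "false" means the boolean value rather than a string.

```javascript
// Hypothetical helper illustrating the rule in the commit message above.
// Assumption: item.repository === false (boolean) means "store nothing".
function resolveRepository(item, translatorLabel) {
  if (item.repository === false) {
    return null; // translator explicitly opted out; no value is stored
  }
  if (typeof item.repository === "string" && item.repository.length > 0) {
    return item.repository; // translator explicitly set a value
  }
  return translatorLabel; // default: the translator's label
}

// Usage
console.log(resolveRepository({}, "JSTOR"));                    // "JSTOR"
console.log(resolveRepository({ repository: false }, "JSTOR")); // null
```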
Simon Kornblith
6c2c33fc6d - closes #391, second export to same location with attached files fails (I think)
- improves RDF error handling
2006-12-13 03:37:58 +00:00
Simon Kornblith
986fea0b03 -closes #365, update import/export/bibliography to handle new item types. any fields I couldn't find in an existing RDF ontology use the Zotero namespace. we still have to decide if primary creators should be mapped to "author" in the RDF, and translated back out later (currently they aren't).
-adds Zotero.Utilities.getCreatorsForType to Zotero utilities
-makes CiteBase search translator error more gracefully
2006-12-13 03:18:57 +00:00
Simon Kornblith
6e84f20de3 closes #428, line endings missing in imported RIS notes 2006-12-12 17:05:29 +00:00
Simon Kornblith
c5ec016ed9 - closes #327, scrapers should either take snapshots or use URL field
- closes #351, scrapers with PDF downloads should use downloadAssociatedFiles instead of automaticSnapshots

there are some problems with snapshot titles. see bug #436.
2006-12-12 00:28:49 +00:00
Simon Kornblith
0c2ee5d449 closes #406, Incorrect "et al" handling for APA style 2006-12-11 20:59:40 +00:00
Simon Kornblith
6c80c879da - closes #407, error in EBSCOhost translator
- closes #430, Amazon translator causing utilities.js to throw exception
- officially deprecated Zotero.Utilities.getNodeString() (use doc.evaluate and nodeValue or textContent instead, or access attributes directly; these options take nearly the same amount of code, should be faster, and don't unnecessarily bloat our utilities)
- updated word integration to the latest version
2006-12-11 20:54:22 +00:00
Dan Stillman
58466ef656 BibTeX patch from Patrick Wagstrom on dev list
His note:

1. adds the conference paper item type (currently only exported to BibTeX as inproceedings)
2. Fixes bug with editor names in BibTeX export
3. Provides more intelligent naming for entities in BibTeX exports. Previously items would be named something like Wagstrom2006, Wagstrom2006-1, etc. However, I noticed that this ordering could get changed around pretty easily in the export process, resulting in bad references in articles. We can't really be having that now can we? The keys now take the first word of the title, stripping out a few common words. For example, if I had a paper called "Zoteros impact on time to author scholarly papers", it would have a key of "wagstrom_zotero_2006", which is much more constant.


There was still an editor field bug after Patrick's patch that I corrected, and author and editor fields seem to be handled properly now.


Also addresses #384, option to prevent escaping of curly brackets in BibTeX output

I believe this patch actually now prevents escaping of curly braces by default, however (according to Simon) it should still be based on a pref or option of some kind
2006-12-09 23:09:14 +00:00
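The key scheme described in point 3 of the patch note (author, first significant title word, year) might look roughly like the sketch below. It is an illustration under assumptions, not the patch itself: `makeCiteKey` is a hypothetical name, the stop-word list is invented because the log does not give the real one, and only ASCII letters and digits are kept.

```javascript
// Hypothetical stop-word list; the words stripped by the actual patch are not listed in the log.
const STOP_WORDS = new Set(["a", "an", "the", "on", "of", "in", "to", "and"]);

// Build a key of the form lastname_firstword_year.
function makeCiteKey(lastName, title, year) {
  const words = title
    .toLowerCase()
    .split(/\s+/)
    .map((w) => w.replace(/[^a-z0-9]/g, "")) // keep ASCII letters and digits only
    .filter((w) => w.length > 0 && !STOP_WORDS.has(w));
  const firstWord = words.length > 0 ? words[0] : "";
  const author = lastName.toLowerCase().replace(/[^a-z0-9]/g, "");
  // Drop empty parts so a missing title word does not leave a double underscore.
  return [author, firstWord, String(year)].filter((p) => p.length > 0).join("_");
}

// Usage
console.log(makeCiteKey("Wagstrom", "Zotero impact on time to author scholarly papers", 2006));
// wagstrom_zotero_2006
```

Because the key depends only on the item's own metadata, it does not change when the export order changes, which is the stability the note is after.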
Sean Takats
507efb4758 Replaces SIRSI -2003 and SIRSI 2003+ translators with single SIRSI translator. Handles WebCat, iLink, and iBistro interfaces. Ready for Emory test XPI but needs some refinement to handle other library view preferences as noted in #381 2006-12-07 04:22:57 +00:00
Sean Takats
e0d955afba closes #373 page numbers not captured in PubMed/HubMed 2006-11-29 16:42:09 +00:00
Dan Stillman
9a7c18ed5e Had to repush some scrapers, since apparently phpMyAdmin on the server replaces \n with \r\n on edits, pushing some over the b2.r2 limit 2006-11-27 22:58:36 +00:00
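To see why a newline rewrite can push a translator over a size limit, here is a small sketch. The function names and the `limit` parameter are hypothetical; the actual limit enforced by b2.r2 is not stated in this commit.

```javascript
// Length of the code after every line ending is rewritten as \r\n,
// which is what the commit says phpMyAdmin did on edit.
function crlfLength(code) {
  return code.replace(/\r?\n/g, "\r\n").length;
}

// True if the converted code no longer fits within `limit` characters.
// `limit` is supplied by the caller; this sketch does not assume a value.
function exceedsLimit(code, limit) {
  return crlfLength(code) > limit;
}

// Usage: each \n grows by one character.
console.log("a\nb\nc".length);     // 5
console.log(crlfLength("a\nb\nc")); // 7
```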
Sean Takats
e391844acd closes #387 year in date field is truncated. 2006-11-27 14:15:27 +00:00
Dan Stillman
f545e6a884 Setting minVersion for Google Scholar and Embedded RDF to 1.0.0b3.r1 2006-11-26 23:53:47 +00:00
Dan Stillman
24ae82b07f Aleph/arXiv/CrossRef/CiteBase pushed to repo 2006-11-26 23:50:58 +00:00
Dan Stillman
361a1e4bc6 Add minVersion/maxVersion to translators schema and schema update mechanisms (local and remote) -- these aren't really necessary on the client but let us use the same SQL to update the repo, and we probably should include them in error reports (instead of relying on different timestamps to differentiate versions)
Added minVersion and maxVersion times to existing scrapers, setting 1.0.0b3.r1 as minVersion for any >4096 characters; these could theoretically now be added back to the repository without problems, but there's not really much reason to test that theory at the moment
2006-11-26 09:19:07 +00:00
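The size rule in this commit (translators over 4096 characters get 1.0.0b3.r1 as minVersion) could be expressed as in the sketch below. `minVersionFor` and the `defaultMinVersion` argument are hypothetical; the log does not say which minVersion smaller translators received.

```javascript
const SIZE_THRESHOLD = 4096;                       // from the commit message
const LARGE_TRANSLATOR_MIN_VERSION = "1.0.0b3.r1"; // from the commit message

// Pick a minVersion for a translator based on the length of its code.
// `defaultMinVersion` is a caller-supplied placeholder for smaller translators.
function minVersionFor(code, defaultMinVersion) {
  return code.length > SIZE_THRESHOLD
    ? LARGE_TRANSLATOR_MIN_VERSION
    : defaultMinVersion;
}

// Usage (the default value here is only a placeholder)
console.log(minVersionFor("x".repeat(5000), "1.0.0b2.r1")); // 1.0.0b3.r1
```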
Dan Stillman
c8cecf4b7e Pushed updated NYT and Google Books translators to repo
Refs #409, Google Books translator broken after site update
Refs #380, Archived New York Times articles accessed via TimesSelect aren't detected
2006-11-25 19:59:45 +00:00
Sean Takats
88d8f19ece closes #409, google books translator broken after site update 2006-11-25 19:22:33 +00:00
Sean Takats
fc2be5bf21 closes #380 by updating translator regex to run against select.times.com. note that the example article in #380 still will not display the zotero icon or scrape, since that article does not contain the standard meta tags that we use to scrape nytimes articles. other timesselect content now does scrape, however. 2006-11-25 04:07:50 +00:00
Simon Kornblith
38531da9fa closes #396, accents are lost when scraping multiple items (with InnoPAC) 2006-11-25 03:41:13 +00:00
Simon Kornblith
5caf0d2803 made arXiv/eprintweb translator work with lists of recent articles, etc. 2006-11-25 03:16:33 +00:00
Simon Kornblith
e201c3b580 made arXiv translator work with eprintweb as well 2006-11-25 02:53:38 +00:00
Simon Kornblith
05b3cd8566 - added arXiv.org translator
- added CiteBase OpenURL search translator (although CiteBase COinS still won't work, because you can't look most of them up with the CiteBase resolver; ugh)
- fixed Amazon translator type ID (12 -> 4)
2006-11-25 02:13:17 +00:00
Simon Kornblith
94302bbe1c closes #403, Aleph translator not working
i modified the XPath the Aleph translator uses to something that should work in nearly every case.
2006-11-25 00:01:24 +00:00
Sean Takats
6ff2168729 Amazon scraper now supports international Amazon sites and retrieves data from Amazon's API 2006-11-21 21:56:13 +00:00
Simon Kornblith
445ff98277 - made doGet handle multiple urls, with processor/done style interface (as in processDocuments). this should be backwards compatible
- beginnings of mapping for new item types
- fixes for Word integration (because i was using it to write a paper)
2006-11-21 07:14:27 +00:00
Simon Kornblith
a1269146b7 - fixed XML issues with PubMed scraper (although probably not the issue that everyone seems to be experiencing)
- unfinished support for new item types
2006-11-02 00:33:50 +00:00
Dan Stillman
7a3be3e306 Updating SIRSI scraper to last time from repo
(The current repo system is a bit flawed in that translators need to be inserted with CURRENT_TIMESTAMP but scrapers.sql can't be, so scrapers.sql needs to be updated with the repo timestamp after the fact to prevent new installs from unnecessarily grabbing the changed scrapers (or they need to be post-dated to a timestamp after the UTC time of their repository insert but preferably not by more than 24 hours). Suffice it to say, we'll have a more automated solution for this in the future.)
2006-10-25 19:07:11 +00:00
Sean Takats
48659542d3 Updated SIRSI translator to handle author field (not just personal author). 2006-10-25 17:53:17 +00:00
Simon Kornblith
666831748e closes #358, APA style doesn't properly handle references with editors and no authors
closes #348, OpenURL should use only relevant parts of dates
closes #354, Error saving History Cooperative article
closes #356, Embedded Dublin Core scraper incorrectly saves web pages as item type "book"
closes #355, PubMed translator problem
closes #368, RIS/Endnote export hijack doesn't go into active collection
fixes an issue with quotation marks in bibliographies exported as RTF
fixes an issue with bibliographies and non-English locales
2006-10-23 07:34:34 +00:00
Dan Stillman
fab65f743c Eek--bump the scraper version after clearing the tables for upgraders 2006-10-06 15:26:04 +00:00
Dan Stillman
7712a24434 Moved translators and CSL CREATE TABLE statements to userdata.sql, since those are the two tables that we actually _want_ users to modify (without them being wiped on every update) 2006-10-05 23:50:29 +00:00
Dan Stillman
73149b86c7 Add ECL license block to scrapers.sql 2006-10-05 17:29:03 +00:00
Simon Kornblith
cbe7c086e1 closes #336, Some metadata fields are not exported with notes and attachments
closes #165, verify import/export can carry all data for all fields and item types
closes #168, make sure MODS import works with files from external sources
2006-10-05 08:45:44 +00:00
Dan Stillman
cd26267afe Closes #340, Change isInstitution to fieldMode everywhere
Including in the DB, which it turns out isn't really all that bad (thanks, among other things, to SQLite's ability to DROP tables within transactions without autocommitting (which MySQL can't do))
2006-10-05 00:59:26 +00:00
Simon Kornblith
92620afa52 fix a couple of rather inconsequential small bugs 2006-10-04 00:31:29 +00:00
Simon Kornblith
ac50ab16a2 Scholar -> Zotero (thanks Dan S.) 2006-10-04 00:10:35 +00:00
Simon Kornblith
56e77619c4 closes #334, Washington Post scraper shouldn't include " - washingtonpost.com" in title
closes #313, Blacklist known ad sites from scraper detection
closes #306, some New York Times ads prevent page from being recognized
closes #308, attachment import bug

currently, the ad site blacklist is located at the top of ingester/browser.js. at some point, we may want to switch this to a database table.
2006-10-03 22:13:49 +00:00
Simon Kornblith
96ccf85aba - improve CSL
- tag institutional authors appropriately
2006-10-03 21:08:02 +00:00
Dan Stillman
1cd51be497 Sorry, it was now or never, and now is better:
Changed "Scholar" to "Zotero", everywhere

Apologies to anyone with working copy changes, but there are probably fewer at this moment than there will be again.

Hopefully this won't break anything, though existing prefs will be lost. I avoided scholar.google.com--if you know any other legitimate "scholar"s in the code, be sure to fix them once I'm done here.

This is a multi-commit change--there's at least one more coming. *Do not update to this version! It won't work!*
2006-10-02 23:15:27 +00:00
Dan Stillman
eccc2159c1 Oops--CSL table needs to be defined in scrapers.sql too.
(The problem with the current system is that any local translators or styles will be wiped out on upgrades (though not auto-updates), but the solution for that is probably to just offer an SQL file that the user can put custom SQL statements in to be run on upgrades (sorta the same idea as user.js in Firefox). Will deal with that at a later date, though.)
2006-10-02 21:25:47 +00:00