Fixes #377, Problems scraping from Hubmed/PubMed
Fixes #381, SIRSI scraper no longer working at William & Mary

Also: a new Amazon scraper, a few COinS fixes, and possibly some others.

It turns out Firefox has a bug in which DOM nodeValues greater than 4096 characters are split into multiple nodes, so any scrapers pulled from the repository with 'code' fields greater than 4K were being truncated. We didn't see it during testing of repo code because most are smaller. Calling normalize() on the node combines the nodes, so future releases won't have the problem regardless of when it's fixed in Firefox.

For existing installs, I managed to get PubMed, COinS, SIRSI 2003+, and, with quite a lot of effort, Amazon, under 4096 characters, hopefully without breaking anything. I removed all other scrapers from the repository for now.
This commit is contained in:
parent
7a1339158a
commit
cc8ef0b93d
```diff
@@ -526,6 +526,10 @@ Zotero.Schema = new function(){
 	* update the local scrapers table with the scraper data
 	**/
 	function _translatorXMLToDB(xmlnode){
+		// Don't split >4K chunks into multiple nodes
+		// https://bugzilla.mozilla.org/show_bug.cgi?id=194231
+		xmlnode.normalize();
+		
 		var sqlValues = [
 			{string: xmlnode.getAttribute('id')},
 			{string: xmlnode.getAttribute('lastUpdated')},
```