Initial commit
38
README
Normal file
|
@ -0,0 +1,38 @@
This suite of tools retrieves a local copy of the FreeCAD wiki
and uses it to generate qhelp and pdf files. Downloading the
entire wiki is now a huge operation, prone to network errors,
so it has been cut into 2 parts: one to retrieve a list of files
to download, and another to actually download the files.

1) Run "buildwikiindex.py" to build an index file containing
   a list of all the files to download.

2) Run "downloadwiki.py". If the connection drops, run it again;
   the already downloaded files will be skipped.

3) Run "buildqhelp.py" to generate the freecad.qhc and freecad.qch
   files.

4) Run "buildpdf.py" to generate freecad.pdf (wkhtmltopdf must be installed).

5) The qhelp files can be tested with "assistant -collectionFile freecad.qhc".

6) If you have already downloaded the whole wiki, run "update.py" immediately
   after, to create a list of revision IDs for each page.

7) Once the initial revisions list has been created, the "update.py" script
   can be run anytime in the future, to check for pages that have changed
   since the stored revision ID. The script is meant to be run twice: once to get
   a list of pages that have changed, and once more to download the changed
   pages (and all their dependencies) again.

8) To split the generated freecad.qch into parts that are smaller than 50 MB
   (github limit): split -d --bytes=49M localwiki/freecad.qch localwiki/freecad.qch.part

9) To join the parts again (for testing): cat localwiki/freecad.qch.part* > test.qch
   Then check that test.qch has the same md5 checksum as localwiki/freecad.qch

10) To test: assistant -collectionFile localwiki/freecad.qhc
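Steps 8 and 9 of the README (split, join, compare md5) can also be sketched in Python, for systems where "split" and "cat" are not available. This is only an illustration under assumptions: the helper names md5sum, split_file and join_files are hypothetical and not part of this tool suite. The part names mimic the two-digit numeric suffixes that "split -d" produces.

```python
import hashlib

def md5sum(path, blocksize=65536):
    "returns the hex md5 digest of a file, read block by block"
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def split_file(path, partsize):
    "writes path.part00, path.part01, ... and returns the list of part paths"
    parts = []
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(partsize)
            if not data:
                break
            partpath = "%s.part%02d" % (path, index)
            with open(partpath, "wb") as out:
                out.write(data)
            parts.append(partpath)
            index += 1
    return parts

def join_files(parts, target):
    "concatenates the parts, in sorted name order, into target"
    with open(target, "wb") as out:
        for partpath in sorted(parts):
            with open(partpath, "rb") as f:
                out.write(f.read())
    return target
```

With these helpers, the check from step 9 would read: md5sum("test.qch") == md5sum("localwiki/freecad.qch").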
557
buildpdf.py
Executable file
|
@ -0,0 +1,557 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
#***************************************************************************
|
||||
#* *
|
||||
#* Copyright (c) 2009 Yorik van Havre <yorik@uncreated.net> *
|
||||
#* *
|
||||
#* This program is free software; you can redistribute it and/or modify *
|
||||
#* it under the terms of the GNU Lesser General Public License (LGPL) *
|
||||
#* as published by the Free Software Foundation; either version 2 of *
|
||||
#* the License, or (at your option) any later version. *
|
||||
#* for detail see the LICENCE text file. *
|
||||
#* *
|
||||
#* This program is distributed in the hope that it will be useful, *
|
||||
#* but WITHOUT ANY WARRANTY; without even the implied warranty of *
|
||||
#* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
|
||||
#* GNU Library General Public License for more details. *
|
||||
#* *
|
||||
#* You should have received a copy of the GNU Library General Public *
|
||||
#* License along with this program; if not, write to the Free Software *
|
||||
#* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 *
|
||||
#* USA *
|
||||
#* *
|
||||
#***************************************************************************
|
||||
|
||||
__title__="buildpdf"
|
||||
__author__ = "Yorik van Havre <yorik@uncreated.net>"
|
||||
__url__ = "http://www.freecadweb.org"
|
||||
|
||||
"""
|
||||
This script builds a pdf file from a local copy of the wiki
|
||||
"""
|
||||
|
||||
TOC="""Online_Help_Startpage
|
||||
About_FreeCAD
|
||||
Feature_list
|
||||
Installing
|
||||
Getting_started
|
||||
Mouse_Model
|
||||
Document_structure
|
||||
Property_editor
|
||||
Import_Export
|
||||
Workbenches
|
||||
|
||||
begin
|
||||
|
||||
Part_Workbench
|
||||
Part_Box
|
||||
Part_Cone
|
||||
Part_Cylinder
|
||||
Part_Sphere
|
||||
Part_Torus
|
||||
Part_CreatePrimitives
|
||||
Part_Plane
|
||||
Part_Prism
|
||||
Part_Wedge
|
||||
Part_Helix
|
||||
Part_Spiral
|
||||
Part_Circle
|
||||
Part_Ellipse
|
||||
Part_Line
|
||||
Part_Point
|
||||
Part_RegularPolygon
|
||||
Part_Booleans
|
||||
# Part_Common
|
||||
# Part_Cut
|
||||
Part_Fuse
|
||||
# Part_Shapebuilder
|
||||
Part_Extrude
|
||||
Part_Fillet
|
||||
Part_Revolve
|
||||
Part_SectionCross
|
||||
Part_Chamfer
|
||||
Part_Mirror
|
||||
Part_RuledSurface
|
||||
Part_Sweep
|
||||
Part_Loft
|
||||
Part_Offset
|
||||
Part_Thickness
|
||||
Part_RefineShape
|
||||
Part_CheckGeometry
|
||||
|
||||
begin
|
||||
|
||||
PartDesign_Workbench
|
||||
PartDesign_Pad
|
||||
PartDesign_Pocket
|
||||
PartDesign_Revolution
|
||||
PartDesign_Groove
|
||||
|
||||
Sketcher_Point
|
||||
Sketcher_Line
|
||||
Sketcher_Arc
|
||||
Sketcher_Circle
|
||||
Sketcher_Ellipse
|
||||
Sketcher_Arc_of_Ellipse
|
||||
Sketcher_Polyline
|
||||
Sketcher_Rectangle
|
||||
Sketcher_Triangle
|
||||
Sketcher_Square
|
||||
Sketcher_Pentagon
|
||||
Sketcher_Hexagon
|
||||
Sketcher_Heptagon
|
||||
Sketcher_Octagon
|
||||
Sketcher_Slot
|
||||
Sketcher_Fillet
|
||||
Sketcher_Trimming
|
||||
# Sketcher_Arc3Point
|
||||
# Sketcher_Circle3Point
|
||||
# Sketcher_ConicSections
|
||||
# Sketcher_Ellipse_by_3_Points
|
||||
|
||||
Constraint_PointOnPoint
|
||||
Constraint_Vertical
|
||||
Constraint_Horizontal
|
||||
Constraint_Parallel
|
||||
Constraint_Perpendicular
|
||||
Constraint_Tangent
|
||||
Constraint_EqualLength
|
||||
Constraint_Symmetric
|
||||
Constraint_Lock
|
||||
Constraint_HorizontalDistance
|
||||
Constraint_VerticalDistance
|
||||
Constraint_Length
|
||||
Constraint_Radius
|
||||
Constraint_InternalAngle
|
||||
Constraint_SnellsLaw
|
||||
Constraint_Internal_Alignment
|
||||
# Constraint_PointOnObject
|
||||
|
||||
Sketcher_MapSketch
|
||||
Sketcher_Reorient
|
||||
Sketcher_Validate
|
||||
Sketcher_Show_Hide_Internal_Geometry
|
||||
# Sketcher_MergeSketch
|
||||
Sketcher_CloseShape
|
||||
Sketcher_ConnectLines
|
||||
# Sketcher_SelectConstraints
|
||||
# Sketcher_SelectOrigin
|
||||
# Sketcher_SelectVerticalAxis
|
||||
# Sketcher_SelectHorizontalAxis
|
||||
# Sketcher_SelectRedundantConstraints
|
||||
# Sketcher_SelectConflictingConstraints
|
||||
# Sketcher_SelectElementsAssociatedWithConstraints
|
||||
|
||||
PartDesign_Fillet
|
||||
PartDesign_Chamfer
|
||||
PartDesign_Draft
|
||||
PartDesign_Mirrored
|
||||
PartDesign_LinearPattern
|
||||
PartDesign_PolarPattern
|
||||
PartDesign_Scaled
|
||||
PartDesign_MultiTransform
|
||||
PartDesign_WizardShaft
|
||||
PartDesign_InvoluteGear
|
||||
|
||||
Sketcher_Tutorial
|
||||
|
||||
begin
|
||||
|
||||
Draft_Workbench
|
||||
Draft_Line
|
||||
Draft_Wire
|
||||
Draft_Circle
|
||||
Draft_Arc
|
||||
Draft_Ellipse
|
||||
Draft_Polygon
|
||||
Draft_Rectangle
|
||||
Draft_Text
|
||||
Draft_Dimension
|
||||
Draft_BSpline
|
||||
Draft_Point
|
||||
Draft_ShapeString
|
||||
Draft_Facebinder
|
||||
Draft_BezCurve
|
||||
Draft_Move
|
||||
Draft_Rotate
|
||||
Draft_Offset
|
||||
Draft_Trimex
|
||||
Draft_Upgrade
|
||||
Draft_Downgrade
|
||||
Draft_Scale
|
||||
Draft_Edit
|
||||
Draft_WireToBSpline
|
||||
Draft_AddPoint
|
||||
Draft_DelPoint
|
||||
Draft_Shape2DView
|
||||
Draft_Draft2Sketch
|
||||
Draft_Array
|
||||
Draft_Clone
|
||||
Draft_SelectPlane
|
||||
Draft_VisGroup
|
||||
|
||||
begin
|
||||
|
||||
Arch_Workbench
|
||||
Arch_Wall
|
||||
Arch_Structure
|
||||
Arch_Rebar
|
||||
Arch_Floor
|
||||
Arch_Building
|
||||
Arch_Site
|
||||
Arch_Window
|
||||
Arch_SectionPlane
|
||||
Arch_Axis
|
||||
Arch_Roof
|
||||
Arch_Space
|
||||
Arch_Stairs
|
||||
Arch_Panel
|
||||
Arch_Frame
|
||||
Arch_Equipment
|
||||
Arch_CutPlane
|
||||
Arch_Add
|
||||
Arch_Remove
|
||||
Arch_Survey
|
||||
Arch_tutorial
|
||||
|
||||
begin
|
||||
|
||||
Drawing_Workbench
|
||||
Drawing_Landscape_A3
|
||||
Drawing_View
|
||||
Drawing_Annotation
|
||||
Drawing_Clip
|
||||
Drawing_Openbrowser
|
||||
Drawing_Symbol
|
||||
Drawing_DraftView
|
||||
Drawing_Save
|
||||
Drawing_ProjectShape
|
||||
# Drawing_Othoviews
|
||||
|
||||
begin
|
||||
|
||||
Raytracing_Workbench
|
||||
# Raytracing_New
|
||||
# Raytracing_Lux
|
||||
# Raytracing_Part
|
||||
# Raytracing_ResetCamera
|
||||
# Raytracing_Export
|
||||
# Raytracing_Render
|
||||
|
||||
begin
|
||||
|
||||
Robot_Workbench
|
||||
# Robot_createRobot
|
||||
# Robot_Simulate
|
||||
# Robot_Export
|
||||
# Robot_SetHomePos
|
||||
# Robot_RestoreHomePos
|
||||
# Robot_CreateTrajectory
|
||||
# Robot_SetDefaultOrientation
|
||||
# Robot_InsertWaypoint
|
||||
# Robot_InsertWaypointPre
|
||||
# Robot_Edge2Trac
|
||||
# Robot_TrajectoryDressUp
|
||||
# Robot_TrajectoryCompound
|
||||
|
||||
begin
|
||||
|
||||
OpenSCAD_Workbench
|
||||
OpenSCAD_AddOpenSCADElement
|
||||
# OpenSCAD_ColorCodeShape
|
||||
# OpenSCAD_ReplaceObject
|
||||
# OpenSCAD_RemoveSubtree
|
||||
# OpenSCAD_RefineShapeFeature
|
||||
# OpenSCAD_IncreaseTolerance
|
||||
# OpenSCAD_Edgestofaces
|
||||
# OpenSCAD_ExpandPlacements
|
||||
# OpenSCAD_ExplodeGroup
|
||||
# OpenSCAD_MeshBoolean
|
||||
# OpenSCAD_Hull
|
||||
# OpenSCAD_Minkowski
|
||||
|
||||
begin
|
||||
|
||||
Fem_Workbench
|
||||
FEM_Analysis
|
||||
# FEM_Solver
|
||||
# FEM_Create
|
||||
# FEM_Material
|
||||
# FEM_Calculation
|
||||
# FEM_DefineNodes
|
||||
# FEM_FixedConstraint
|
||||
# FEM_ForceConstraint
|
||||
# FEM_BearingConstraint
|
||||
# FEM_GearConstraint
|
||||
# FEM_PulleyConstraint
|
||||
# FEM_ShowResult
|
||||
|
||||
begin
|
||||
|
||||
Plot_Module
|
||||
Plot_Save
|
||||
Plot_Basic_tutorial
|
||||
Plot_MultiAxes_tutorial
|
||||
# Plot_Axes
|
||||
# Plot_Series
|
||||
# Plot_Grid
|
||||
# Plot_Legend
|
||||
# Plot_Labels
|
||||
# Plot_Positions
|
||||
|
||||
begin
|
||||
|
||||
Mesh_Workbench
|
||||
|
||||
end
|
||||
|
||||
Interface_Customization
|
||||
Preferences_Editor
|
||||
Macros
|
||||
Introduction_to_Python
|
||||
Python_scripting_tutorial
|
||||
Topological_data_scripting
|
||||
Mesh_Scripting
|
||||
Mesh_to_Part
|
||||
Scenegraph
|
||||
Pivy
|
||||
|
||||
begin
|
||||
|
||||
PySide
|
||||
PySide_Beginner_Examples
|
||||
PySide_Medium_Examples
|
||||
PySide_Advanced_Examples
|
||||
|
||||
end
|
||||
|
||||
Scripted_objects
|
||||
Embedding_FreeCAD
|
||||
Embedding_FreeCADGui
|
||||
Code_snippets"""
|
||||
|
||||
import sys, os, re, tempfile, getopt, shutil, time
|
||||
from urllib2 import urlopen, HTTPError
|
||||
|
||||
# CONFIGURATION #################################################
|
||||
|
||||
INDEX = "Online_Help_Toc" # the start page from where to crawl the wiki
|
||||
PDFCONVERTOR = 'wkhtmltopdf' # can be 'pisa', 'htmldoc', 'wkhtmltopdf' or 'firefox'
|
||||
VERBOSE = True # set true to get output messages
|
||||
INCLUDECOMMANDS = True # if true, the command pages of each workbench are included after each WB page
|
||||
OVERWRITE = False # if true, pdf files are recreated even if already existing
|
||||
FIREFOXPDFFOLDER = os.path.expanduser("~")+os.sep+"PDF" # if firefox is used, set this to where it places its pdf files by default
|
||||
COVER = "http://www.freecadweb.org/wiki/images/7/79/Freecad-pdf-cover.svg"
|
||||
|
||||
# END CONFIGURATION ##############################################
|
||||
|
||||
|
||||
FOLDER = "./localwiki"
|
||||
|
||||
fcount = dcount = 0
|
||||
|
||||
def crawl():
    "creates a pdf file from the localwiki folder"

    # tests ###############################################

    if PDFCONVERTOR == 'pisa':
        try:
            import ho.pisa as pisa
        except:
            print "Error: Python-pisa not installed, exiting."
            return 1
    elif PDFCONVERTOR == 'htmldoc':
        if os.system('htmldoc --version'):
            print "Error: Htmldoc not found, exiting."
            return 1
    try:
        from PyPDF2 import PdfFileReader,PdfFileWriter
    except:
        print "Error: Python-pypdf2 not installed, exiting."
        return 1 # joinpdf() cannot work without PyPDF2

    # run ########################################################

    buildpdffiles()
    joinpdf()

    if VERBOSE: print "All done!"
    return 0
|
||||
|
||||
|
||||
def buildpdffiles():
|
||||
"scans a folder for html files and converts them all to pdf"
|
||||
templist = os.listdir(FOLDER)
|
||||
if PDFCONVERTOR == 'wkhtmltopdf':
|
||||
makeStyleSheet()
|
||||
global fileslist
|
||||
fileslist = []
|
||||
for i in templist:
|
||||
if i[-5:] == '.html':
|
||||
fileslist.append(i)
|
||||
print "converting ",len(fileslist)," pages"
|
||||
i = 1
|
||||
for f in fileslist:
|
||||
print i," : ",f
|
||||
if PDFCONVERTOR == 'pisa':
|
||||
createpdf_pisa(f[:-5])
|
||||
elif PDFCONVERTOR == 'wkhtmltopdf':
|
||||
createpdf_wkhtmltopdf(f[:-5])
|
||||
elif PDFCONVERTOR == 'firefox':
|
||||
createpdf_firefox(f[:-5])
|
||||
else:
|
||||
createpdf_htmldoc(f[:-5])
|
||||
i += 1
|
||||
|
||||
|
||||
def fetch_resources(uri, rel):
|
||||
"""
|
||||
Callback to allow pisa/reportlab to retrieve Images,Stylesheets, etc.
|
||||
'uri' is the href attribute from the html link element.
|
||||
'rel' gives a relative path, but it's not used here.
|
||||
|
||||
Note from Yorik: Not working!!
|
||||
"""
|
||||
path = os.path.join(FOLDER,uri.replace("./", ""))
|
||||
return path
|
||||
|
||||
def createpdf_pisa(pagename):
    "creates a pdf file from a saved page using pisa (python module)"
    import ho.pisa as pisa
    if (not exists(pagename+".pdf",image=True)) or OVERWRITE:
        infile = open(FOLDER + os.sep + pagename+'.html','r')
        outfile = open(FOLDER + os.sep + pagename+'.pdf','wb')
        if VERBOSE: print "Converting " + pagename + " to pdf..."
        pdf = pisa.CreatePDF(infile,outfile,FOLDER,link_callback=fetch_resources)
        infile.close()
        outfile.close()
        if pdf.err:
            return pdf.err
        return 0
|
||||
|
||||
|
||||
def createpdf_firefox(pagename):
    "creates a pdf file from a saved page using firefox (needs command line printing extension)"
    # the default printer will be used, so make sure it is set to pdf
    # command line printing extension http://forums.mozillazine.org/viewtopic.php?f=38&t=2729795
    if (not exists(pagename+".pdf",image=True)) or OVERWRITE:
        infile = FOLDER + os.sep + pagename+'.html'
        outfile = FOLDER + os.sep + pagename+'.pdf'
        # no return here: the printed file still has to be moved into place
        os.system('firefox -print ' + infile)
        time.sleep(6)
        if os.path.exists(FIREFOXPDFFOLDER + os.sep + pagename + ".pdf"):
            shutil.move(FIREFOXPDFFOLDER+os.sep+pagename+".pdf",outfile)
        else:
            print "-----------------------------------------> Couldn't find print output!"
|
||||
|
||||
|
||||
def createpdf_htmldoc(pagename):
|
||||
"creates a pdf file from a saved page using htmldoc (external app, but supports images)"
|
||||
if (not exists(pagename+".pdf",image=True)) or OVERWRITE:
|
||||
infile = FOLDER + os.sep + pagename+'.html'
|
||||
outfile = FOLDER + os.sep + pagename+'.pdf'
|
||||
return os.system('htmldoc --webpage --textfont sans --browserwidth 840 -f '+outfile+' '+infile)
|
||||
|
||||
|
||||
def createpdf_wkhtmltopdf(pagename):
    "creates a pdf file from a saved page using wkhtmltopdf (external app, but supports images)"
    if (not exists(pagename+".pdf",image=True)) or OVERWRITE:
        infile = FOLDER + os.sep + pagename+'.html'
        outfile = FOLDER + os.sep + pagename+'.pdf'
        cmd = 'wkhtmltopdf -L 5mm --user-style-sheet '+FOLDER+os.sep+'wkhtmltopdf.css '+infile+' '+outfile
        print cmd
        # the call below is disabled, so the command is only printed, not run
        #return os.system(cmd)
    else:
        print "skipping"
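The wkhtmltopdf command is assembled by string concatenation, so a page name containing spaces or shell metacharacters would break it once it is handed to the shell. A minimal sketch of an alternative, assuming the command is built as an argument list instead: build_wkhtmltopdf_cmd is a hypothetical helper, not part of this script, and the flags are the ones already used in the script.

```python
import os

FOLDER = "./localwiki"  # same value as in the script

def build_wkhtmltopdf_cmd(pagename, folder=FOLDER):
    "returns the wkhtmltopdf call for a page as an argument list (hypothetical helper)"
    infile = os.path.join(folder, pagename + ".html")
    outfile = os.path.join(folder, pagename + ".pdf")
    stylesheet = os.path.join(folder, "wkhtmltopdf.css")
    return ["wkhtmltopdf", "-L", "5mm",
            "--user-style-sheet", stylesheet,
            infile, outfile]

# each argument stays one list item, so no shell quoting is needed
cmd = build_wkhtmltopdf_cmd("Part Box (old)")
```

Such a list could then be passed to subprocess.call(cmd), which returns the exit status of the program.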
|
||||
|
||||
|
||||
def joinpdf():
|
||||
"creates one pdf file from several others, following order from the cover"
|
||||
from PyPDF2 import PdfFileReader,PdfFileWriter
|
||||
if VERBOSE: print "Building table of contents..."
|
||||
|
||||
result = PdfFileWriter()
|
||||
createCover()
|
||||
inputfile = PdfFileReader(open(FOLDER+os.sep+'Cover.pdf','rb'))
|
||||
result.addPage(inputfile.getPage(0))
|
||||
count = 1
|
||||
|
||||
tocfile = TOC.split("\n")
|
||||
parent = False
|
||||
for page in tocfile:
|
||||
page = page.strip()
|
||||
if page:
|
||||
if page[0] == "#":
|
||||
continue
|
||||
if page == "begin":
|
||||
parent = True
|
||||
continue
|
||||
if page == "end":
|
||||
parent = False
|
||||
continue
|
||||
if VERBOSE: print 'Appending',page, "at position",count
|
||||
title = page.replace("_"," ")
|
||||
pdffile = page + ".pdf"
|
||||
if exists(pdffile,True):
|
||||
inputfile = PdfFileReader(open(FOLDER + os.sep + pdffile,'rb'))
|
||||
numpages = inputfile.getNumPages()
|
||||
for i in range(numpages):
|
||||
result.addPage(inputfile.getPage(i))
|
||||
if parent == True:
|
||||
parent = result.addBookmark(title,count)
|
||||
elif parent == False:
|
||||
result.addBookmark(title,count)
|
||||
else:
|
||||
result.addBookmark(title,count,parent)
|
||||
count += numpages
|
||||
else:
|
||||
print "page",pdffile,"not found, aborting."
|
||||
sys.exit()
|
||||
|
||||
if VERBOSE: print "Writing..."
|
||||
outputfile = open(FOLDER+os.sep+"freecad.pdf",'wb')
|
||||
result.write(outputfile)
|
||||
outputfile.close()
|
||||
if VERBOSE:
|
||||
print ' '
|
||||
print 'Successfully created '+FOLDER+os.sep+'freecad.pdf'
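The TOC handling in joinpdf() can be read as a small state machine: lines starting with "#" are skipped, "begin" makes the next page the head of a new section, following pages are nested under it, and "end" returns to top level. The sketch below isolates that logic without PyPDF2, as an illustration only. parse_toc and its (title, level) output are hypothetical and not part of the script.

```python
def parse_toc(toc):
    "returns (title, level) pairs: 0 = top-level bookmark, 1 = nested under the last section head"
    entries = []
    state = "top"  # one of "top", "section-start", "in-section"
    for line in toc.split("\n"):
        page = line.strip()
        if not page or page.startswith("#"):
            continue  # blank lines and commented-out pages are skipped
        if page == "begin":
            state = "section-start"
            continue
        if page == "end":
            state = "top"
            continue
        title = page.replace("_", " ")
        if state == "section-start":
            entries.append((title, 0))  # first page after "begin" heads the section
            state = "in-section"
        elif state == "in-section":
            entries.append((title, 1))
        else:
            entries.append((title, 0))
    return entries

sample = """Workbenches

begin

Part_Workbench
Part_Box
# Part_Cut

begin

Mesh_Workbench

end

Macros"""

entries = parse_toc(sample)
```

A new "begin" closes the previous section implicitly, which is why the TOC string only needs an "end" before returning to top-level pages.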
|
||||
|
||||
|
||||
def local(page,image=False):
|
||||
"returns a local path for a given page/image"
|
||||
if image:
|
||||
return FOLDER + os.sep + page
|
||||
else:
|
||||
return FOLDER + os.sep + page + '.html'
|
||||
|
||||
|
||||
def exists(page,image=False):
|
||||
"checks if given page/image already exists"
|
||||
path = local(page,image)
|
||||
if os.path.exists(path): return True
|
||||
return False
|
||||
|
||||
|
||||
def makeStyleSheet():
|
||||
"Creates a stylesheet for wkhtmltopdf"
|
||||
outputfile = open(FOLDER+os.sep+"wkhtmltopdf.css",'wb')
|
||||
outputfile.write("""
|
||||
.printfooter {
|
||||
display:none !important;
|
||||
}
|
||||
""")
|
||||
outputfile.close()
|
||||
|
||||
|
||||
def createCover():
|
||||
"downloads and creates a cover page"
|
||||
if VERBOSE: print "fetching " + COVER
|
||||
data = (urlopen(COVER).read())
|
||||
path = FOLDER + os.sep + "Cover.svg"
|
||||
fil = open(path,'wb')
|
||||
fil.write(data)
|
||||
fil.close()
|
||||
os.system('inkscape --export-pdf='+FOLDER+os.sep+'Cover.pdf'+' '+FOLDER+os.sep+'Cover.svg')
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
crawl()
|
237
buildqhelp.py
Executable file
|
@ -0,0 +1,237 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
#***************************************************************************
|
||||
#* *
|
||||
#* Copyright (c) 2009 Yorik van Havre <yorik@uncreated.net> *
|
||||
#* *
|
||||
#* This program is free software; you can redistribute it and/or modify *
|
||||
#* it under the terms of the GNU Lesser General Public License (LGPL) *
|
||||
#* as published by the Free Software Foundation; either version 2 of *
|
||||
#* the License, or (at your option) any later version. *
|
||||
#* for detail see the LICENCE text file. *
|
||||
#* *
|
||||
#* This program is distributed in the hope that it will be useful, *
|
||||
#* but WITHOUT ANY WARRANTY; without even the implied warranty of *
|
||||
#* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
|
||||
#* GNU Library General Public License for more details. *
|
||||
#* *
|
||||
#* You should have received a copy of the GNU Library General Public *
|
||||
#* License along with this program; if not, write to the Free Software *
|
||||
#* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 *
|
||||
#* USA *
|
||||
#* *
|
||||
#***************************************************************************
|
||||
|
||||
__title__="wiki2qhelp"
|
||||
__author__ = "Yorik van Havre <yorik@uncreated.net>"
|
||||
__url__ = "http://www.freecadweb.org"
|
||||
|
||||
"""
|
||||
This script builds qhelp files from a local copy of the wiki
|
||||
"""
|
||||
|
||||
import sys, os, re, tempfile, getopt, shutil
|
||||
from urllib2 import urlopen, HTTPError
|
||||
|
||||
# CONFIGURATION #################################################
|
||||
|
||||
FOLDER = "./localwiki"
|
||||
INDEX = "Online_Help_Toc" # the start page from where to crawl the wiki
|
||||
VERBOSE = True # to display what's going on. Otherwise, runs totally silent.
|
||||
QHELPCOMPILER = 'qhelpgenerator'
|
||||
QCOLLECTIOMGENERATOR = 'qcollectiongenerator'
|
||||
RELEASE = '0.17'
|
||||
|
||||
# END CONFIGURATION ##############################################
|
||||
|
||||
fcount = dcount = 0
|
||||
|
||||
def crawl():
    "builds the qhelp files from the localwiki folder"
|
||||
|
||||
# tests ###############################################
|
||||
|
||||
if os.system(QHELPCOMPILER +' -v'):
|
||||
print "Error: QAssistant not fully installed, exiting."
|
||||
return 1
|
||||
if os.system(QCOLLECTIOMGENERATOR +' -v'):
|
||||
print "Error: QAssistant not fully installed, exiting."
|
||||
return 1
|
||||
|
||||
# run ########################################################
|
||||
|
||||
qhp = buildtoc()
|
||||
qhcp = createCollProjectFile()
|
||||
shutil.copy("freecad-icon-64.png","localwiki/freecad-icon-64.png")
|
||||
if generate(qhcp) or compile(qhp):
|
||||
print "Error at compiling"
|
||||
return 1
|
||||
if VERBOSE: print "All done!"
|
||||
i=raw_input("Copy the files to their correct location in the source tree? y/n (default=no) ")
|
||||
if i.upper() in ["Y","YES"]:
|
||||
shutil.copy("localwiki/freecad.qch","../../Doc/freecad.qch")
|
||||
shutil.copy("localwiki/freecad.qhc","../../Doc/freecad.qhc")
|
||||
else:
|
||||
print 'Files are in localwiki. Test with "assistant -collectionFile localwiki/freecad.qhc"'
|
||||
return 0
|
||||
|
||||
def compile(qhpfile):
    "compiles the whole html doc with qassistant"
    qchfile = FOLDER + os.sep + "freecad.qch"
    if not os.system(QHELPCOMPILER + ' '+qhpfile+' -o '+qchfile):
        if VERBOSE: print "Successfully created",qchfile
        return 0
    return 1 # non-zero, so that crawl() can detect the failure

def generate(qhcpfile):
    "generates qassistant-specific settings like icon, title, ..."
    txt="""
<center>FreeCAD """+RELEASE+""" help files<br/>
<a href="http://www.freecadweb.org">http://www.freecadweb.org</a></center>
"""
    about=open(FOLDER + os.sep + "about.txt","w")
    about.write(txt)
    about.close()
    qhcfile = FOLDER + os.sep + "freecad.qhc"
    if not os.system(QCOLLECTIOMGENERATOR+' '+qhcpfile+' -o '+qhcfile):
        if VERBOSE: print "Successfully created ",qhcfile
        return 0
    return 1 # non-zero, so that crawl() can detect the failure
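crawl() tests "generate(qhcp) or compile(qhp)", so both functions have to return a non-zero value when the external tool fails. A minimal sketch of that convention, assuming the standard subprocess module is used instead of os.system: run_tool is a hypothetical helper, and the Python interpreter stands in for qhelpgenerator so that the example does not depend on Qt being installed.

```python
import subprocess
import sys

def run_tool(args):
    "runs an external command given as an argument list; returns 0 on success, 1 on failure"
    try:
        status = subprocess.call(args)
    except OSError:
        return 1  # the executable could not be started at all
    return 0 if status == 0 else 1

# stand-ins for the real tools: one command that succeeds, one that fails
ok = run_tool([sys.executable, "-c", "import sys; sys.exit(0)"])
failed = run_tool([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Here ok is 0 and failed is 1, so a test such as "if run_tool(a) or run_tool(b)" triggers as soon as one of the two commands fails.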
|
||||
|
||||
def createCollProjectFile():
|
||||
qprojectfile = '''<?xml version="1.0" encoding="UTF-8"?>
|
||||
<QHelpCollectionProject version="1.0">
|
||||
<assistant>
|
||||
<title>FreeCAD User Manual</title>
|
||||
<applicationIcon>freecad-icon-64.png</applicationIcon>
|
||||
<cacheDirectory>freecad/freecad</cacheDirectory>
|
||||
<startPage>qthelp://org.freecad.usermanual/doc/Online_Help_Startpage.html</startPage>
|
||||
<aboutMenuText>
|
||||
<text>About FreeCAD</text>
|
||||
</aboutMenuText>
|
||||
<aboutDialog>
|
||||
<file>about.txt</file>
|
||||
<!--
|
||||
<icon>images/icon.png</icon>
|
||||
-->
|
||||
<icon>freecad-icon-64.png</icon>
|
||||
</aboutDialog>
|
||||
<enableDocumentationManager>true</enableDocumentationManager>
|
||||
<enableAddressBar>true</enableAddressBar>
|
||||
<enableFilterFunctionality>true</enableFilterFunctionality>
|
||||
</assistant>
|
||||
<docFiles>
|
||||
<generate>
|
||||
<file>
|
||||
<input>freecad.qhp</input>
|
||||
<output>freecad.qch</output>
|
||||
</file>
|
||||
</generate>
|
||||
<register>
|
||||
<file>freecad.qch</file>
|
||||
</register>
|
||||
</docFiles>
|
||||
</QHelpCollectionProject>
|
||||
'''
|
||||
if VERBOSE: print "Building project file..."
|
||||
qfilename = FOLDER + os.sep + "freecad.qhcp"
|
||||
f = open(qfilename,'w')
|
||||
f.write(qprojectfile)
|
||||
f.close()
|
||||
if VERBOSE: print "Done writing qhcp file",qfilename
|
||||
return qfilename
|
||||
|
||||
def buildtoc():
|
||||
'''
|
||||
gets the table of contents page and parses its
|
||||
contents into a clean list structure
|
||||
'''
|
||||
|
||||
qhelpfile = '''<?xml version="1.0" encoding="UTF-8"?>
|
||||
<QtHelpProject version="1.0">
|
||||
<namespace>org.freecad.usermanual</namespace>
|
||||
<virtualFolder>doc</virtualFolder>
|
||||
<!--
|
||||
<customFilter name="FreeCAD '''+RELEASE+'''">
|
||||
<filterAttribute>FreeCAD</filterAttribute>
|
||||
<filterAttribute>'''+RELEASE+'''</filterAttribute>
|
||||
</customFilter>
|
||||
-->
|
||||
<filterSection>
|
||||
<!--
|
||||
<filterAttribute>FreeCAD</filterAttribute>
|
||||
<filterAttribute>'''+RELEASE+'''</filterAttribute>
|
||||
-->
|
||||
<toc>
|
||||
<inserttoc>
|
||||
</toc>
|
||||
<keywords>
|
||||
<insertkeywords>
|
||||
</keywords>
|
||||
<insertfiles>
|
||||
</filterSection>
|
||||
</QtHelpProject>
|
||||
'''
|
||||
|
||||
def getname(line):
|
||||
line = re.compile('<li>').sub('',line)
|
||||
line = re.compile('</li>').sub('',line)
|
||||
title = line.strip()
|
||||
link = ''
|
||||
if "<a" in line:
|
||||
title = re.findall('<a[^>]*>(.*?)</a>',line)[0].strip()
|
||||
link = re.findall('href="(.*?)"',line)[0].strip()
|
||||
if not link: link = 'default.html'
|
||||
return title,link
|
||||
|
||||
if VERBOSE: print "Building table of contents..."
|
||||
f = open(FOLDER+os.sep+INDEX+'.html')
|
||||
html = ''
|
||||
for line in f: html += line
|
||||
f.close()
|
||||
html = html.replace("\n"," ")
|
||||
html = html.replace("> <","><")
|
||||
html = re.findall("<ul.*/ul>",html)[0]
|
||||
items = re.findall('<li[^>]*>.*?</li>|</ul></li>',html)
|
||||
inserttoc = '<section title="FreeCAD Documentation" ref="Online_Help_Toc.html">\n'
|
||||
insertkeywords = ''
|
||||
for item in items:
|
||||
if not ("<ul>" in item):
|
||||
if ("</ul>" in item):
|
||||
inserttoc += '</section>\n'
|
||||
else:
|
||||
link = ''
|
||||
title,link=getname(item)
|
||||
if link:
|
||||
link='" ref="'+link
|
||||
insertkeywords += ('<keyword name="'+title+link+'"/>\n')
|
||||
inserttoc += ('<section title="'+title+link+'"></section>\n')
|
||||
else:
|
||||
subitems = item.split("<ul>")
|
||||
for i in range(len(subitems)):
|
||||
link = ''
|
||||
title,link=getname(subitems[i])
|
||||
if link:
|
||||
link='" ref="'+link
|
||||
insertkeywords += ('<keyword name="'+title+link+'"/>\n')
|
||||
trail = ''
|
||||
if i == len(subitems)-1: trail = '</section>'
|
||||
inserttoc += ('<section title="'+title+link+'">'+trail+'\n')
|
||||
inserttoc += '</section>\n'
|
||||
|
||||
insertfiles = "<files>\n"
|
||||
for fil in os.listdir(FOLDER):
|
||||
insertfiles += ("<file>"+fil+"</file>\n")
|
||||
insertfiles += "</files>\n"
|
||||
|
||||
qhelpfile = re.compile('<insertkeywords>').sub(insertkeywords,qhelpfile)
|
||||
qhelpfile = re.compile('<inserttoc>').sub(inserttoc,qhelpfile)
|
||||
qhelpfile = re.compile('<insertfiles>').sub(insertfiles,qhelpfile)
|
||||
qfilename = FOLDER + os.sep + "freecad.qhp"
|
||||
f = open(qfilename,'wb')
|
||||
f.write(qhelpfile)
|
||||
f.close()
|
||||
if VERBOSE: print "Done writing qhp file",qfilename
|
||||
return qfilename
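To make the parsing done in buildtoc() easier to follow, here is a standalone adaptation of its getname() helper together with the kind of section element it feeds. This is a sketch: toc_entry is a hypothetical name, the sample lines are invented, and titles are not XML-escaped here, just as in the script.

```python
import re

def getname(line):
    "returns (title, link) for one <li> item; link falls back to 'default.html'"
    line = line.replace("<li>", "").replace("</li>", "")
    title = line.strip()
    link = ""
    if "<a" in line:
        title = re.findall('<a[^>]*>(.*?)</a>', line)[0].strip()
        link = re.findall('href="(.*?)"', line)[0].strip()
    if not link:
        link = "default.html"
    return title, link

def toc_entry(line):
    "builds a <section> element like the ones written into the qhp table of contents"
    title, link = getname(line)
    return '<section title="%s" ref="%s"></section>' % (title, link)

linked = toc_entry('<li><a href="Part_Box.html" title="Part Box">Part Box</a></li>')
plain = getname('<li>Plain heading</li>')
```

An item without a link keeps its stripped text as the title and points to default.html.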
|
||||
|
||||
if __name__ == "__main__":
|
||||
crawl()
|
||||
|
200
buildwikiindex.py
Executable file
|
@ -0,0 +1,200 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
#***************************************************************************
|
||||
#* *
|
||||
#* Copyright (c) 2009 Yorik van Havre <yorik@uncreated.net> *
|
||||
#* *
|
||||
#* This program is free software; you can redistribute it and/or modify *
|
||||
#* it under the terms of the GNU Lesser General Public License (LGPL) *
|
||||
#* as published by the Free Software Foundation; either version 2 of *
|
||||
#* the License, or (at your option) any later version. *
|
||||
#* for detail see the LICENCE text file. *
|
||||
#* *
|
||||
#* This program is distributed in the hope that it will be useful, *
|
||||
#* but WITHOUT ANY WARRANTY; without even the implied warranty of *
|
||||
#* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
|
||||
#* GNU Library General Public License for more details. *
|
||||
#* *
|
||||
#* You should have received a copy of the GNU Library General Public *
|
||||
#* License along with this program; if not, write to the Free Software *
|
||||
#* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 *
|
||||
#* USA *
|
||||
#* *
|
||||
#***************************************************************************
|
||||
|
||||
__title__="buildwikiindex.py"
|
||||
__author__ = "Yorik van Havre <yorik@uncreated.net>"
|
||||
__url__ = "http://www.freecadweb.org"
|
||||
|
||||
"""
|
||||
This script parses the contents of a wiki site and saves a file containing
|
||||
names of pages and images to be downloaded.
|
||||
"""
|
||||
|
||||
import sys, os, re, tempfile, getopt
|
||||
from urllib2 import urlopen, HTTPError
|
||||
|
||||
# CONFIGURATION #################################################
|
||||
|
||||
URL = "https://www.freecadweb.org/wiki" #default URL if no URL is passed
|
||||
INDEX = "Online_Help_Toc" # the start page from where to crawl the wiki
|
||||
NORETRIEVE = ['Manual','Developer_hub','Power_users_hub','Users_hub','Source_documentation', 'User_hub','Main_Page','About_this_site','Interesting_links','Syndication_feeds','FreeCAD:General_disclaimer','FreeCAD:About','FreeCAD:Privacy_policy','WikiPages'] # pages that won't be fetched (kept online)
|
||||
GETTRANSLATIONS = False # Set true if you want to get the translations too.
|
||||
MAXFAIL = 3 # max number of retries if download fails
|
||||
VERBOSE = True # to display what's going on. Otherwise, runs totally silent.
|
||||
WRITETHROUGH = True # if true, fetched files are constantly written to disk, in case of failure.
|
||||
|
||||
# END CONFIGURATION ##############################################
|
||||
|
||||
wikiindex = "/index.php?title="
|
||||
|
||||
def crawl(pagename=[]):
|
||||
"downloads an entire wiki site"
|
||||
todolist = []
|
||||
processed = []
|
||||
count = 1
|
||||
if pagename:
|
||||
if not isinstance(pagename,list):
|
||||
pagename = [pagename]
|
||||
todolist = pagename
|
||||
else:
|
||||
if os.path.exists("wikifiles.txt"):
|
||||
f = open("wikifiles.txt","r")
|
||||
if VERBOSE: print "Reading existing list..."
|
||||
for l in f.readlines():
|
||||
if l.strip() != "":
|
||||
if VERBOSE: print "Adding ",l
|
||||
processed.append(l.strip())
|
||||
f.close()
|
||||
if os.path.exists("todolist.txt"):
|
||||
f = open("todolist.txt","r")
|
||||
if VERBOSE: print "Reading existing todo list..."
|
||||
for l in f.readlines():
|
||||
if l.strip() != "":
|
||||
todolist.append(l.strip())
|
||||
f.close()
|
||||
else:
|
||||
indexpages,imgs = get(INDEX)
|
||||
todolist.extend(indexpages)
|
||||
while todolist:
|
||||
targetpage = todolist.pop()
|
||||
if (not targetpage in NORETRIEVE):
|
||||
if VERBOSE: print count, ": Scanning ", targetpage
|
||||
pages,images = get(targetpage)
|
||||
count += 1
|
||||
processed.append(targetpage)
|
||||
processed.extend(images)
|
||||
if VERBOSE: print "got",len(pages),"links"
|
||||
for p in pages:
|
||||
if (not (p in todolist)) and (not (p in processed)):
|
||||
todolist.append(p)
|
||||
if WRITETHROUGH:
|
||||
writeList(processed)
|
||||
writeList(todolist,"todolist.txt")
|
||||
if VERBOSE: print "Fetched ", count, " pages"
|
||||
if not WRITETHROUGH:
|
||||
writeList(processed)
|
||||
if pagename:
|
||||
return processed
|
||||
return 0
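The loop in crawl() is a worklist traversal: pop a page, record it, and queue every linked page that is neither queued nor already processed. The sketch below shows the same idea on an in-memory link table, so it runs without network access. crawl_graph, WIKI and the page names in it are hypothetical illustration data, and the image lists and file writing of the real function are left out.

```python
def crawl_graph(start, get_links, skip=()):
    "visits every page reachable from start; get_links(page) returns the pages it links to"
    todolist = list(get_links(start))
    processed = []
    while todolist:
        page = todolist.pop()
        if page in skip:
            continue  # same role as NORETRIEVE: never fetched, never followed
        processed.append(page)
        for p in get_links(page):
            if p not in todolist and p not in processed:
                todolist.append(p)
    return processed

# a tiny stand-in for the wiki: page name -> names of linked pages
WIKI = {
    "Online_Help_Toc": ["About_FreeCAD", "Part_Workbench", "Main_Page"],
    "About_FreeCAD": ["Part_Workbench"],
    "Part_Workbench": ["Part_Box", "About_FreeCAD"],
    "Part_Box": [],
    "Main_Page": ["Secret"],
}

pages = crawl_graph("Online_Help_Toc", lambda p: WIKI.get(p, []), skip=["Main_Page"])
```

Because "Main_Page" is skipped, the page it links to is never discovered, and pages that link to each other are still visited only once.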
|
||||
|
||||
def get(page):
    "downloads a single page, returns the pages and images it links to"
    html = fetchpage(page)
    html = cleanhtml(html)
    pages = getlinks(html)
    images = getimagelinks(html)
    return pages,images

def cleanhtml(html):
    "cleans given html code from dirty script stuff"
    html = html.replace('\n','Wlinebreak') # removing linebreaks for regex processing
    html = re.compile('(.*)<div[^>]+column-content+[^>]+>').sub('',html) # stripping before content
    html = re.compile('<div[^>]+column-one+[^>]+>.*').sub('',html) # stripping after content
    html = re.compile('<!--[^>]+-->').sub('',html) # removing comment tags
    html = re.compile('<script[^>]*>.*?</script>').sub('',html) # removing script tags
    html = re.compile('<!--\[if[^>]*>.*?endif\]-->').sub('',html) # removing IE tags
    html = re.compile('<div id="jump-to-nav"[^>]*>.*?</div>').sub('',html) # removing nav div
    html = re.compile('<h3 id="siteSub"[^>]*>.*?</h3>').sub('',html) # removing print subtitle
    html = re.compile('Retrieved from').sub('Online version:',html) # changing online title
    # [^>]* (not [^>]) so the rest of the opening tag is matched, whatever its length
    html = re.compile('<div id="mw-normal-catlinks[^>]*>.*?</div>').sub('',html) # removing catlinks
    html = re.compile('<div class="NavHead.*?</div>').sub('',html) # removing nav stuff
    html = re.compile('<div class="NavContent.*?</div>').sub('',html) # removing nav stuff
    html = re.compile('<div class="NavEnd.*?</div>').sub('',html) # removing nav stuff
    html = re.compile('<div class="mw-pt-translate-header.*?</div>').sub('',html) # removing translations links
    if not GETTRANSLATIONS:
        html = re.compile('<div class="languages.*?</div>').sub('',html) # removing translations links
        html = re.compile('<div class="mw-pt-languages.*?</div>').sub('',html) # removing translations links
    html = re.compile('Wlinebreak').sub('\n',html) # restoring original linebreaks
    return html
|
||||
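The cleaning step above swaps newlines for a placeholder so that `.` can match across lines. A minimal sketch of an alternative, assuming Python 3 and the standard `re.DOTALL` flag, is shown below; `strip_blocks` and the sample markup are hypothetical names used only for illustration and are not part of this script.

```python
import re

def strip_blocks(html, patterns):
    """Remove every match of each pattern, letting '.' span newlines."""
    for pattern in patterns:
        # re.DOTALL makes '.' match '\n', so no placeholder is needed
        html = re.sub(pattern, '', html, flags=re.DOTALL)
    return html

sample = ('<p>keep</p>\n'
          '<script type="text/javascript">\nvar x = 1;\n</script>'
          '\n<p>also keep</p>')
cleaned = strip_blocks(sample, [r'<script[^>]*>.*?</script>'])
print(cleaned)
```

With this approach the text is never altered by a placeholder, so a page that happens to contain the literal word used as placeholder is left intact.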
|
||||
def getlinks(html):
    "returns a list of wikipage links in html file"
    global NORETRIEVE
    links = re.findall('<a[^>]*>.*?</a>',html)
    pages = []
    for l in links:
        # rg = re.findall('php\?title=(.*)\" title',l)
        rg = re.findall('href=.*?php\?title=(.*?)"',l)
        if not rg:
            rg = re.findall('href="\/wiki\/(.*?)"',l)
            if "images" in rg:
                rg = None
        if rg:
            rg = rg[0]
            if not "Command_Reference" in rg:
                if "#" in rg:
                    rg = rg.split('#')[0]
                if ":" in rg:
                    NORETRIEVE.append(rg)
                if "&" in rg:
                    NORETRIEVE.append(rg)
                if ";" in rg:
                    NORETRIEVE.append(rg)
                if "/" in rg:
                    if not GETTRANSLATIONS:
                        NORETRIEVE.append(rg)
            if not rg in NORETRIEVE:
                pages.append(rg)
                print "got link: ",rg
    return pages
|
||||
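As a worked example of the link extraction above, the following sketch applies the same two `href` patterns to a small, made-up HTML fragment. It assumes Python 3; `page_titles` and the sample anchors are hypothetical, and the filtering into `NORETRIEVE` is left out on purpose.

```python
import re

def page_titles(html):
    """Return wiki page titles referenced by <a> tags, without #fragments."""
    titles = []
    for anchor in re.findall(r'<a[^>]*>.*?</a>', html):
        # first try the index.php?title=... form, then the /wiki/... form
        found = re.findall(r'href=.*?php\?title=(.*?)"', anchor)
        if not found:
            found = re.findall(r'href="/wiki/(.*?)"', anchor)
        if found:
            titles.append(found[0].split('#')[0])
    return titles

sample = ('<a href="/wiki/index.php?title=Arch_Wall" title="Arch Wall">Wall</a> '
          '<a href="/wiki/Draft_Line#Options">Line</a> '
          '<a href="https://example.org/">external</a>')
titles = page_titles(sample)
print(titles)
```

The external link yields no title because neither pattern matches it, which is how links leaving the wiki are ignored.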
|
||||
def getimagelinks(html):
    "returns a list of image links found in an html file"
    imlinks = re.findall('<img.*?src="(.*?)"',html)
    imlinks = [l for l in imlinks if not l.startswith("http")] # remove external images
    return imlinks

def fetchpage(page):
    "retrieves given page from the wiki"
    print "fetching: ",page
    failcount = 0
    while failcount < MAXFAIL:
        try:
            html = (urlopen(URL + wikiindex + page).read())
            return html
        except HTTPError:
            failcount += 1
    print 'Error: unable to fetch page ' + page
    sys.exit()

def cleanList(pagelist):
    "removes duplicates and redlink entries from the list, keeping the order"
    npages = []
    for p in pagelist:
        if not p in npages:
            if not "redlink" in p:
                npages.append(p)
    return npages

def writeList(pages,filename="wikifiles.txt"):
    "writes the cleaned list to a file, one entry per line"
    pages = cleanList(pages)
    f = open(filename,"wb")
    for p in pages:
        f.write(p+"\n")
    f.close()
    if VERBOSE: print "written ",filename

if __name__ == "__main__":
    crawl(sys.argv[1:])
|
||||
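The crawl in buildwikiindex.py is a worklist loop: pop a page, scan it, and queue every link that has not been queued or processed yet. A minimal Python 3 sketch of that idea follows. It is not the script's own code: `crawl_links`, `get_links` and the toy `graph` are hypothetical, and a set is used for membership tests where the script uses lists.

```python
def crawl_links(start, get_links, skip=()):
    """Visit every page reachable from start; return them in visit order."""
    todo = [start]
    seen = {start}          # everything ever queued, for fast membership tests
    visited = []
    while todo:
        page = todo.pop()
        if page in skip:    # plays the role of NORETRIEVE
            continue
        visited.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                todo.append(link)
    return visited

# toy link graph standing in for the wiki
graph = {"Index": ["A", "B"], "A": ["B", "C"], "B": ["Index"], "C": []}
order = crawl_links("Index", lambda page: graph.get(page, []), skip={"C"})
print(order)
```

Because `todo.pop()` takes the most recently added page, the traversal is depth-first; the cycle between "Index" and "B" ends because "Index" is already in `seen`.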
|
347
downloadwiki.py
Executable file
|
@ -0,0 +1,347 @@
|
|||
#!/usr/bin/env python

#***************************************************************************
#*                                                                         *
#*   Copyright (c) 2009 Yorik van Havre <yorik@uncreated.net>              *
#*                                                                         *
#*   This program is free software; you can redistribute it and/or modify  *
#*   it under the terms of the GNU Lesser General Public License (LGPL)    *
#*   as published by the Free Software Foundation; either version 2 of     *
#*   the License, or (at your option) any later version.                   *
#*   for detail see the LICENCE text file.                                 *
#*                                                                         *
#*   This program is distributed in the hope that it will be useful,       *
#*   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
#*   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the          *
#*   GNU Library General Public License for more details.                  *
#*                                                                         *
#*   You should have received a copy of the GNU Library General Public     *
#*   License along with this program; if not, write to the Free Software   *
#*   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307   *
#*   USA                                                                   *
#*                                                                         *
#***************************************************************************

__title__="downloadwiki"
__author__ = "Yorik van Havre <yorik@uncreated.net>"
__url__ = "http://www.freecadweb.org"

"""
This script retrieves the contents of a wiki site from a pages list
"""

import sys, os, re, tempfile, getopt
from urllib2 import urlopen, HTTPError

# CONFIGURATION #################################################

DEFAULTURL = "https://www.freecadweb.org" #default URL if no URL is passed
INDEX = "Online_Help_Toc" # the start page from where to crawl the wiki
NORETRIEVE = ['Manual','Developer_hub','Power_users_hub','Users_hub','Source_documentation', 'User_hub','Main_Page','About_this_site','FreeCAD:General_disclaimer','FreeCAD:About','FreeCAD:Privacy_policy','Introduction_to_python'] # pages that won't be fetched (kept online)
GETTRANSLATIONS = False # Set true if you want to get the translations too.
MAXFAIL = 3 # max number of retries if download fails
VERBOSE = True # to display what's going on. Otherwise, runs totally silent.

# END CONFIGURATION ##############################################

FOLDER = "./localwiki"
LISTFILE = "wikifiles.txt"
URL = DEFAULTURL
wikiindex = "/wiki/index.php?title="
defaultfile = "<html><head><link type='text/css' href='wiki.css' rel='stylesheet'></head><body> </body></html>"
|
||||
css = """/* Basic CSS for offline wiki rendering */

body {
    font-family: Fira Sans,Arial,Helvetica,sans-serif;
    font-size: 14px;
    text-align: justify;
    /*background: #fff;
    color: #000;*/
    max-width: 800px;
}

h1 {
    font-size: 2.4em;
    font-weight: bold;
    padding: 5px;
    border-radius: 5px;
}

h2 {
    font-weight: normal;
    font-size: 1.6em;
    border-bottom: 1px solid #ddd;
}

h3 {
    padding-left: 20px;
}

img {
    max-width: 100%;
}

li {
    margin-top: 10px;
}

pre, .mw-code {
    text-align: left;
    /*background: #eee;*/
    padding: 5px 5px 5px 20px;
    font-family: mono;
    border-radius: 2px;
}

a:link, a:visited {
    font-weight: bold;
    text-decoration: none;
    color: #2969C4;
}

a:hover {
    text-decoration: underline;
}

.printfooter {
    font-size: 0.8em;
    color: #333333;
    border-top: 1px solid #333;
    margin-top: 20px;
}

.wikitable #toc {
    font-size: 0.8em;
}

.ct, .ctTitle, .ctOdd, .ctEven th {
    font-size: 1em;
    text-align: left;
    width: 190px;
    float: right;
    /*background: #eee;*/
    margin-top: 10px;
    border-radius: 2px;
}

.ct {
    margin-left: 15px;
    padding: 10px;
}
#mw-navigation {
    display:none; /*TODO remove on next build (included below)*/
}
"""
|
||||
|
||||
def crawl():
    "downloads an entire wiki site"
    global processed
    processed = []
    if VERBOSE: print "crawling ", URL, ", saving in ", FOLDER
    if not os.path.isdir(FOLDER): os.mkdir(FOLDER)
    file = open(FOLDER + os.sep + "wiki.css",'wb')
    file.write(css)
    file.close()
    dfile = open(FOLDER + os.sep + "default.html",'wb')
    dfile.write(defaultfile)
    dfile.close()
    lfile = open(LISTFILE)
    global locallist
    locallist = []
    for l in lfile: locallist.append(l.replace("\n",""))
    lfile.close()
    todolist = locallist[:]
    print "getting",len(todolist),"files..."
    count = 1
    indexpages = get(INDEX)
    while todolist:
        targetpage = todolist.pop()
        if VERBOSE: print count, ": Fetching ", targetpage
        get(targetpage)
        count += 1
    if VERBOSE: print "Fetched ", count, " pages"
    if VERBOSE: print "All done!"
    return 0

def get(page):
    "downloads a single page or image, unless it already exists locally"
    localpage = page
    if "Command_Reference" in localpage:
        localpage = localpage.replace("Category:","")
        localpage = localpage.replace("&pagefrom=","+")
        localpage = localpage.replace("#mw-pages","")
    if page[-4:] in [".png",".jpg",".svg",".gif","jpeg",".PNG",".JPG"]:
        fetchimage(page)
    elif not exists(localpage):
        html = fetchpage(page)
        html = cleanhtml(html)
        pages = getlinks(html)
        html = cleanlinks(html,pages)
        html = cleanimagelinks(html)
        output(html,page)
    else:
        if VERBOSE: print " skipping",page
|
||||
|
||||
def getlinks(html):
    "returns a list of wikipage links in html file"
    links = re.findall('<a[^>]*>.*?</a>',html)
    pages = []
    for l in links:
        # rg = re.findall('php\?title=(.*)\" title',l)
        rg = re.findall('href=.*?php\?title=(.*?)"',l)
        if not rg:
            rg = re.findall('href="\/wiki\/(.*?)"',l)
        if rg:
            rg = rg[0]
            if not "Command_Reference" in rg:
                if "#" in rg:
                    rg = rg.split('#')[0]
                if ":" in rg:
                    NORETRIEVE.append(rg)
                if ";" in rg:
                    NORETRIEVE.append(rg)
                if "&" in rg:
                    NORETRIEVE.append(rg)
                if "/" in rg:
                    if not GETTRANSLATIONS:
                        NORETRIEVE.append(rg)
            pages.append(rg)
    return pages

def getimagelinks(html):
    "returns a list of image links found in an html file"
    return re.findall('<img.*?src="(.*?)"',html)
|
||||
|
||||
def cleanhtml(html):
    "cleans given html code from dirty script stuff"
    html = html.replace('\n','Wlinebreak') # removing linebreaks for regex processing
    html = html.replace('\t','') # removing tab marks
    html = re.compile('(.*)<div id=\"content+[^>]+>').sub('',html) # stripping before content
    html = re.compile('<div id="mw-head+[^>]+>.*').sub('',html) # stripping after content
    html = re.compile('<!--[^>]+-->').sub('',html) # removing comment tags
    html = re.compile('<script[^>]*>.*?</script>').sub('',html) # removing script tags
    html = re.compile('<!--\[if[^>]*>.*?endif\]-->').sub('',html) # removing IE tags
    html = re.compile('<div id="jump-to-nav"[^>]*>.*?</div>').sub('',html) # removing nav div
    html = re.compile('<h3 id="siteSub"[^>]*>.*?</h3>').sub('',html) # removing print subtitle
    html = re.compile('Retrieved from').sub('Online version:',html) # changing online title
    html = re.compile('<div id="mw-normal-catlinks.*?</div>').sub('',html) # removing catlinks
    html = re.compile('<div class="NavHead.*?</div>').sub('',html) # removing nav stuff
    html = re.compile('<div class="NavContent.*?</div>').sub('',html) # removing nav stuff
    html = re.compile('<div class="NavEnd.*?</div>').sub('',html) # removing nav stuff
    html = re.compile('<div id="mw-navigation.*?</div>').sub('',html) # removing nav stuff
    html = re.compile('<table id="toc.*?</table>').sub('',html) # removing toc
    html = re.compile('width=\"100%\" style=\"float: right; width: 230px; margin-left: 1em\"').sub('',html) # removing command box styling
    html = re.compile('<div class="docnav.*?</div>Wlinebreak</div>').sub('',html) # removing docnav
    html = re.compile('<div class="mw-pt-translate-header.*?</div>').sub('',html) # removing translations links
    if not GETTRANSLATIONS:
        html = re.compile('<div class="languages.*?</div>').sub('',html) # removing translations links
        html = re.compile('<div class="mw-pt-languages.*?</div>').sub('',html) # removing translations links
    html = re.compile('Wlinebreak').sub('\n',html) # restoring original linebreaks
    return html


def cleanlinks(html, pages=None):
    "cleans page links found in html"
    if not pages: pages = getlinks(html)
    for page in pages:
        if page in NORETRIEVE:
            output = 'href="' + URL + wikiindex + page + '"'
        else:
            output = 'href="' + page.replace("/","-") + '.html"'
        # re.escape: the page title must be matched literally, not as regex syntax
        html = re.compile('href="[^"]+' + re.escape(page) + '"').sub(output,html)
        if "Command_Reference" in output:
            html = html.replace("Category:","")
            html = html.replace("&pagefrom=","+")
            html = html.replace("#mw-pages",".html")
            html = html.replace("/wiki/index.php?title=Command_Reference","Command_Reference")
    return html
|
||||
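Since cleanlinks builds a regular expression from a page title, any title containing regex metacharacters has to be taken literally. The sketch below shows the idea in Python 3 with `re.escape`; `rewrite_link` and the title `Std_C++` are hypothetical and chosen only because `+` is a regex operator.

```python
import re

def rewrite_link(html, page):
    """Point href attributes ending in `page` at a local .html file."""
    local = 'href="' + page.replace("/", "-") + '.html"'
    pattern = 'href="[^"]+' + re.escape(page) + '"'
    # a function replacement avoids backslash processing in `local`
    return re.sub(pattern, lambda match: local, html)

sample = '<a href="/wiki/index.php?title=Std_C++">C++</a>'
result = rewrite_link(sample, "Std_C++")
print(result)
```

Without the escaping, a title like this one would be read as a pattern with two consecutive repeat operators and would fail to compile instead of being matched.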
|
||||
def cleanimagelinks(html,links=None):
    "cleans image links in given html"
    if not links: links = getimagelinks(html)
    if links:
        for l in links:
            nl = re.findall('.*/(.*)',l)
            if nl: html = html.replace(l,nl[0])
            # fetchimage(l)
    return html

def fetchpage(page):
    "retrieves given page from the wiki"
    print " fetching: ",page
    failcount = 0
    while failcount < MAXFAIL:
        try:
            html = (urlopen(URL + wikiindex + page).read())
            return html
        except HTTPError:
            failcount += 1
    print 'Error: unable to fetch page ' + page
|
||||
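The retry loop in fetchpage can be exercised without a network by passing the download function in as a parameter. This is only a sketch under that assumption, written for Python 3; `fetch_with_retries` and `flaky` are hypothetical helpers, not functions of this script.

```python
def fetch_with_retries(fetch, url, max_fail=3):
    """Call fetch(url) until it succeeds or max_fail attempts have failed."""
    failures = 0
    while failures < max_fail:
        try:
            return fetch(url)
        except IOError:      # urllib's URLError and HTTPError derive from it
            failures += 1
    return None              # the caller must handle a page that never arrived

attempts = []

def flaky(url):
    """Hypothetical fetcher: fails twice, then returns a page."""
    attempts.append(url)
    if len(attempts) < 3:
        raise IOError("simulated network error")
    return "<html>ok</html>"

page = fetch_with_retries(flaky, "Main_Page")
print(page, len(attempts))
```

Returning `None` after the last failure mirrors what fetchpage in downloadwiki.py does implicitly, so code that goes on to clean the HTML needs to check for it first.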
|
||||
def fetchimage(imagelink):
    "retrieves given image from the wiki and saves it"
    if imagelink[0:5] == "File:":
        print "Skipping file page link"
        return
    filename = re.findall('.*/(.*)',imagelink)[0]
    if not exists(filename,image=True):
        failcount = 0
        while failcount < MAXFAIL:
            try:
                if VERBOSE: print " fetching " + filename
                data = (urlopen(URL + imagelink).read())
                path = local(filename,image=True)
                file = open(path,'wb')
                file.write(data)
                file.close()
            except:
                failcount += 1
            else:
                processed.append(filename)
                if VERBOSE: print " saving",local(filename,image=True)
                return
        print 'Error: unable to fetch file ' + filename
    else:
        if VERBOSE: print " skipping",filename

def local(page,image=False):
    "returns a local path for a given page/image"
    if image:
        return FOLDER + os.sep + page
    else:
        return FOLDER + os.sep + page + '.html'

def exists(page,image=False):
    "checks if given page/image already exists"
    path = local(page.replace("/","-"),image)
    if os.path.exists(path): return True
    return False

def webroot(url):
    "returns the scheme and host part of a URL (http or https)"
    return re.findall('(https?://.*?)/',url)[0]

def output(html,page):
    "encapsulates raw html code into nice html body"
    title = page.replace("_"," ")
    header = "<html><head>"
    header += "<title>" + title + "</title>"
    header += '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    header += "<link type='text/css' href='wiki.css' rel='stylesheet'>"
    header += "</head><body>"
    header += "<h1>" + title + "</h1>"
    footer = "</body></html>"
    html = header+html+footer
    filename = local(page.replace("/","-"))
    if "Command_Reference" in filename:
        filename = filename.replace("Category:","")
        filename = filename.replace("&pagefrom=","+")
        filename = filename.replace("#mw-pages","")
        filename = filename.replace(".html.html",".html")
    print " saving",filename
    file = open(filename,'wb')
    file.write(html)
    file.close()

if __name__ == "__main__":
    crawl()
|
||||
|
BIN
freecad-icon-64.png
Normal file
After Width: | Height: | Size: 3.8 KiB |
BIN
localwiki/013-arch-axes.jpg
Normal file
After Width: | Height: | Size: 97 KiB |
BIN
localwiki/013-arch-vrm.jpg
Normal file
After Width: | Height: | Size: 82 KiB |
BIN
localwiki/013-arch-wall.jpg
Normal file
After Width: | Height: | Size: 72 KiB |
BIN
localwiki/013-draft-fillet.jpg
Normal file
After Width: | Height: | Size: 69 KiB |
BIN
localwiki/013-draft-shape2dview.jpg
Normal file
After Width: | Height: | Size: 111 KiB |
BIN
localwiki/013-draft-snap.jpg
Normal file
After Width: | Height: | Size: 50 KiB |
BIN
localwiki/1000px-Macro_Boolean_Overlap_Screenshot.png
Normal file
After Width: | Height: | Size: 278 KiB |
BIN
localwiki/1000px-Macro_Section_Screenshot.png
Normal file
After Width: | Height: | Size: 224 KiB |
BIN
localwiki/1000px-PartDesign_ModlingObjectsHirachy.png
Normal file
After Width: | Height: | Size: 164 KiB |
BIN
localwiki/1000px-ResourceFramework.png
Normal file
After Width: | Height: | Size: 327 KiB |
BIN
localwiki/1000px-Thread-by-horz-profile-profileMake.png
Normal file
After Width: | Height: | Size: 327 KiB |
BIN
localwiki/100px-Bevelgear.png
Normal file
After Width: | Height: | Size: 9.0 KiB |
BIN
localwiki/100px-Cycloidegear.png
Normal file
After Width: | Height: | Size: 8.0 KiB |
BIN
localwiki/100px-Involutegear.png
Normal file
After Width: | Height: | Size: 7.3 KiB |
BIN
localwiki/100px-Involuterack.png
Normal file
After Width: | Height: | Size: 5.2 KiB |
BIN
localwiki/100px-Tutorial-treeview.jpg
Normal file
After Width: | Height: | Size: 4.8 KiB |
BIN
localwiki/1024px-216.png
Normal file
After Width: | Height: | Size: 360 KiB |
BIN
localwiki/1024px-6DPLEQ2.jpg
Normal file
After Width: | Height: | Size: 128 KiB |
BIN
localwiki/1024px-Arch_clip_plane.jpg
Normal file
After Width: | Height: | Size: 195 KiB |
BIN
localwiki/1024px-Arch_workflow_example.jpg
Normal file
After Width: | Height: | Size: 89 KiB |
BIN
localwiki/1024px-BaseStation004.JPG
Normal file
After Width: | Height: | Size: 172 KiB |
BIN
localwiki/1024px-Cura_export.png
Normal file
After Width: | Height: | Size: 332 KiB |
BIN
localwiki/1024px-Draft_dimensions_recode.jpg
Normal file
After Width: | Height: | Size: 131 KiB |
BIN
localwiki/1024px-Draft_hatches.jpg
Normal file
After Width: | Height: | Size: 122 KiB |
BIN
localwiki/1024px-Draft_subdivisions.jpg
Normal file
After Width: | Height: | Size: 100 KiB |
BIN
localwiki/1024px-DrawingWB.png
Normal file
After Width: | Height: | Size: 173 KiB |
BIN
localwiki/1024px-Drawing_spreadsheetview.jpg
Normal file
After Width: | Height: | Size: 105 KiB |
BIN
localwiki/1024px-Drill-FreeCAD.png
Normal file
After Width: | Height: | Size: 316 KiB |
BIN
localwiki/1024px-Easyw_fc.png
Normal file
After Width: | Height: | Size: 267 KiB |
BIN
localwiki/1024px-FreeCAD-guitar.jpg
Normal file
After Width: | Height: | Size: 116 KiB |
BIN
localwiki/1024px-FreeCAD_aeroponic_system.jpg
Normal file
After Width: | Height: | Size: 146 KiB |
BIN
localwiki/1024px-Freecad-bearing.png
Normal file
After Width: | Height: | Size: 190 KiB |
BIN
localwiki/1024px-Freecad-interface.jpg
Normal file
After Width: | Height: | Size: 149 KiB |
BIN
localwiki/1024px-Freecad_default.jpg
Normal file
After Width: | Height: | Size: 172 KiB |
BIN
localwiki/1024px-Freecad_jeep.png
Normal file
After Width: | Height: | Size: 232 KiB |
BIN
localwiki/1024px-Gsuter.png
Normal file
After Width: | Height: | Size: 304 KiB |
BIN
localwiki/1024px-Hhassey.png
Normal file
After Width: | Height: | Size: 163 KiB |
BIN
localwiki/1024px-JMG.png
Normal file
After Width: | Height: | Size: 312 KiB |
BIN
localwiki/1024px-Lhf.jpg
Normal file
After Width: | Height: | Size: 138 KiB |
BIN
localwiki/1024px-Mesh_curvature_plot1.jpeg
Normal file
After Width: | Height: | Size: 219 KiB |
BIN
localwiki/1024px-Obijuan.png
Normal file
After Width: | Height: | Size: 401 KiB |
BIN
localwiki/1024px-Obijuan2.png
Normal file
After Width: | Height: | Size: 441 KiB |
BIN
localwiki/1024px-Partdesign_example.jpg
Normal file
After Width: | Height: | Size: 91 KiB |
BIN
localwiki/1024px-Pic_06.jpg
Normal file
After Width: | Height: | Size: 177 KiB |
BIN
localwiki/1024px-PrzemoF.png
Normal file
After Width: | Height: | Size: 167 KiB |
BIN
localwiki/1024px-R_tec.jpeg
Normal file
After Width: | Height: | Size: 146 KiB |
BIN
localwiki/1024px-Raytracing_example.jpg
Normal file
After Width: | Height: | Size: 121 KiB |
BIN
localwiki/1024px-Rim_bling.png
Normal file
After Width: | Height: | Size: 271 KiB |
BIN
localwiki/1024px-Rockn.png
Normal file
After Width: | Height: | Size: 104 KiB |
BIN
localwiki/1024px-Rockn_house1.png
Normal file
After Width: | Height: | Size: 306 KiB |
BIN
localwiki/1024px-Rockn_house2.png
Normal file
After Width: | Height: | Size: 296 KiB |
BIN
localwiki/1024px-Satnogs_Rotator_FreeCAD.jpg
Normal file
After Width: | Height: | Size: 170 KiB |
BIN
localwiki/1024px-Scharniergreifer_render.jpg
Normal file
After Width: | Height: | Size: 89 KiB |
BIN
localwiki/1024px-Screenshot_from_2018-01-25_20-53-18.jpg
Normal file
After Width: | Height: | Size: 131 KiB |
BIN
localwiki/1024px-Shaftwizard1.jpg
Normal file
After Width: | Height: | Size: 144 KiB |
BIN
localwiki/1024px-Spark-Plug-Plane.jpg
Normal file
After Width: | Height: | Size: 128 KiB |
BIN
localwiki/1024px-Staeubli_step_import.png
Normal file
After Width: | Height: | Size: 200 KiB |
BIN
localwiki/1024px-Startcenter.jpg
Normal file
After Width: | Height: | Size: 151 KiB |
BIN
localwiki/1024px-Style_Sheets.png
Normal file
After Width: | Height: | Size: 400 KiB |
BIN
localwiki/1024px-VIIC_2.jpg
Normal file
After Width: | Height: | Size: 124 KiB |
BIN
localwiki/1024px-Wheel.JPG
Normal file
After Width: | Height: | Size: 275 KiB |
BIN
localwiki/118px-WaWue.JPG
Normal file
After Width: | Height: | Size: 4.7 KiB |
BIN
localwiki/120px-Band.JPG
Normal file
After Width: | Height: | Size: 3.8 KiB |
BIN
localwiki/120px-Blade.JPG
Normal file
After Width: | Height: | Size: 3.2 KiB |
BIN
localwiki/120px-Bottle.jpg
Normal file
After Width: | Height: | Size: 4.8 KiB |
BIN
localwiki/120px-Bottle_detail.jpg
Normal file
After Width: | Height: | Size: 5.9 KiB |
BIN
localwiki/120px-Bottle_detail_interpol.jpg
Normal file
After Width: | Height: | Size: 6.4 KiB |
BIN
localwiki/120px-Bottle_raw.jpg
Normal file
After Width: | Height: | Size: 4.0 KiB |
BIN
localwiki/120px-LaserLine.JPG
Normal file
After Width: | Height: | Size: 3.0 KiB |
BIN
localwiki/120px-WaWue_SphrerFit.jpg
Normal file
After Width: | Height: | Size: 4.5 KiB |
BIN
localwiki/120px-Wawue_Side.jpg
Normal file
After Width: | Height: | Size: 5.0 KiB |
BIN
localwiki/120px-Wawue_Top.jpg
Normal file
After Width: | Height: | Size: 4.0 KiB |
BIN
localwiki/12px-Draft_AddPoint.png
Normal file
After Width: | Height: | Size: 413 B |
BIN
localwiki/12px-Draft_CloseLine.png
Normal file
After Width: | Height: | Size: 523 B |
BIN
localwiki/12px-Draft_DelPoint.png
Normal file
After Width: | Height: | Size: 327 B |
BIN
localwiki/12px-Draft_Edit.png
Normal file
After Width: | Height: | Size: 482 B |
BIN
localwiki/12px-Draft_FinishLine.png
Normal file
After Width: | Height: | Size: 519 B |
BIN
localwiki/12px-Draft_UndoLine.png
Normal file
After Width: | Height: | Size: 482 B |
BIN
localwiki/12px-Draft_Wipe.png
Normal file
After Width: | Height: | Size: 290 B |
BIN
localwiki/133px-Macro_Draft_Circle_3_Points02.png
Normal file
After Width: | Height: | Size: 3.3 KiB |
BIN
localwiki/133px-Macro_Draft_Circle_3_Points03.png
Normal file
After Width: | Height: | Size: 2.2 KiB |
BIN
localwiki/133px-Macro_Draft_Circle_3_Points04.png
Normal file
After Width: | Height: | Size: 5.4 KiB |
BIN
localwiki/133px-Macro_Draft_Circle_3_Points05.png
Normal file
After Width: | Height: | Size: 4.8 KiB |
BIN
localwiki/142px-Draft_Selection_Menu.png
Normal file
After Width: | Height: | Size: 28 KiB |
BIN
localwiki/142px-FreeCAD_Menu_Edition_Peferences.png
Normal file
After Width: | Height: | Size: 28 KiB |
BIN
localwiki/150px-Drawing_View_Iso.png
Normal file
After Width: | Height: | Size: 8.2 KiB |
BIN
localwiki/150px-Drawing_View_Iso_SmoothLines.png
Normal file
After Width: | Height: | Size: 8.6 KiB |
BIN
localwiki/150px-HelpViewer.jpg
Normal file
After Width: | Height: | Size: 11 KiB |
BIN
localwiki/150px-Macro_Cross_Section_01.png
Normal file
After Width: | Height: | Size: 12 KiB |
BIN
localwiki/150px-Tache_Placement_Translation_X_fr.gif
Normal file
After Width: | Height: | Size: 5.2 KiB |
BIN
localwiki/150px-Tache_Placement_Translation_Y_fr.gif
Normal file
After Width: | Height: | Size: 6.1 KiB |
BIN
localwiki/150px-Tache_Placement_Translation_Z_fr.gif
Normal file
After Width: | Height: | Size: 6.3 KiB |
BIN
localwiki/158px-Texture_NanoDesign.png
Normal file
After Width: | Height: | Size: 35 KiB |
BIN
localwiki/167px-Qt_Example_00.png
Normal file
After Width: | Height: | Size: 24 KiB |