Updated 2011.01.23 17:22 +0100
Version 1.8 May 2003.
This article describes how you can improve the quality of your HTML/XHTML pages by integrating HTML Tidy into Microsoft FrontPage 2000, 2002 (Office XP) or 2003 (Beta 2 will do).
Related Article: Integrating Validation into Microsoft FrontPage.
In this new version 1.8 I've added appendix B, containing a preliminary explanation of how to use TidyATL.dll.
The code in this article requires:
The tidy_fp.zip file contains:
How often have you written web documents in editors or word processors that simply couldn't produce the underlying web language correctly? You may not be aware of it, but most of today's HTML editors are not very good at producing valid HTML. As an author of web documents, you have an interest in writing them so that most of your readers can actually read them in one of the browsers available.
Most of the pages on my personal homepage are written as XHTML documents. This is the emerging standard for web documents (see www.w3.org). It's an XML-based version of the HTML standard, with some important differences such as:
For me it is most important that the code is "pretty", commented, and valid with respect to the relevant standards. This should be true whether you have written the code by hand in a regular text editor (like Notepad), or generated it via a WYSIWYG editor (like FrontPage).
This article describes how you can improve web documents written with the Microsoft FrontPage editor. I will mainly focus on the XHTML part. Microsoft FrontPage is just one of many editors in which you can create web documents or manage/edit entire webs (collections of web documents). FrontPage is a fairly decent editor that produces good-quality XHTML code, but it's not perfect.
HTML Tidy is a tool that can clean up the underlying code (tags) of your web document. Tidy is an open source project originally written by Dave Raggett. It's basically a command line tool that takes an HTML file and generates a new HTML file with cleaner code. The new file is generated according to a large set of rules and layout preferences that you specify either on the command line or in a configuration file (the preferred method).
You can download HTML Tidy from http://tidy.sourceforge.net/.
When you have downloaded HTML Tidy, take some time to familiarize yourself with it. Try giving it some HTML files as arguments and see what happens. For example, the following command line passes the -indent option and your file to HTML Tidy:
C:\<path to HTML Tidy>\tidy.exe -indent <your file>.htm
You probably already have some preferences for how the layout should look, and you may want to try tweaking the many different options via a configuration file.
At the HTML Tidy project page, a quick reference of the options can be found.
The configuration file is a simple text based file that you can write in e.g. Notepad. My configuration file (tidy.cfg) looks like the following:
    // Config file for HTML tidy
    output-xhtml: yes
    doctype: strict
    char-encoding: raw
    tidy-mark: no
    markup: yes
    indent: auto
    indent-spaces: 2
    tab-size: 2
    wrap: 120
    break-before-br: no
    drop-empty-paras: yes
    word-2000: yes
    clean: no
    write-back: yes
    keep-time: yes
Microsoft FrontPage, version 2000 or newer, supports Visual Basic® for Applications that we can utilize for integrating HTML Tidy into the menu system. A simple menu activation will clean/tidy your web document.
First download the zip file with the source code files and unzip it to a folder e.g. to C:\Program Files\Validator.
I have made two Visual Basic files available that wrap the call to HTML Tidy within VBA. The code is based on code originally written by Christoph Schneegans. See http://www.schneegans.de/frontpage-vba/tidy.html for details (in German though).
To incorporate these into FrontPage VBA see section 3.2 below.
The VBA code cannot be used directly but must be customized to the location where HTML Tidy is installed. Four string constants should be defined:
| TIDY_PROGRAM_FILE | Should specify the full path and file name of the HTML Tidy executable, e.g. C:\Program Files\Validator\Tidy.exe. |
| TIDY_CONFIG_FILE | Should specify the full path and file name of the HTML Tidy configuration file. |
| TIDY_ERROR_FILE | Should specify the full path and file name of the HTML Tidy error log file. |
| TIDY_TEMP_FILE | Should specify the full path and file name of a temporary file. |
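A wrong path in any of these constants only shows up when the macro runs, so a small pre-flight check can help. The following is a minimal sketch and not part of the article's tidy.bas: the function name FirstMissingTidyPath is hypothetical, and it assumes the Scripting.FileSystemObject component (also used by the macro itself) is available.

```vba
' Hypothetical helper (not in the original tidy.bas).
' Returns a description of the first path problem found,
' or an empty string when all four paths look usable.
Function FirstMissingTidyPath(ByVal sProgramFile As String, _
                              ByVal sConfigFile As String, _
                              ByVal sErrorFile As String, _
                              ByVal sTempFile As String) As String
    Dim fs As Object
    Set fs = CreateObject("Scripting.FileSystemObject")

    FirstMissingTidyPath = ""

    ' The executable and the configuration file must already exist.
    If Not fs.FileExists(sProgramFile) Then
        FirstMissingTidyPath = "Tidy executable not found: " & sProgramFile
    ElseIf Not fs.FileExists(sConfigFile) Then
        FirstMissingTidyPath = "Tidy configuration file not found: " & sConfigFile
    ' The error log and the temporary file are created on demand,
    ' so only their parent folders need to exist.
    ElseIf Not fs.FolderExists(fs.GetParentFolderName(sErrorFile)) Then
        FirstMissingTidyPath = "Folder for the error log not found: " & sErrorFile
    ElseIf Not fs.FolderExists(fs.GetParentFolderName(sTempFile)) Then
        FirstMissingTidyPath = "Folder for the temporary file not found: " & sTempFile
    End If
End Function
```

It could be called at the top of the macro, for example with `FirstMissingTidyPath(TIDY_PROGRAM_FILE, TIDY_CONFIG_FILE, TIDY_ERROR_FILE, TIDY_TEMP_FILE)`, showing the returned text in a message box and exiting when it is not empty.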
    Attribute VB_Name = "Tidy"
    '
    ' Tidy.bas - Integration with Tidy in FrontPage 2000 or newer
    '
    ' Based on code by Christoph Schneegans <mailto:Christoph@Schneegans.de>
    ' See <http://www.schneegans.de/frontpage-vba/tidy.html>
    '
    Option Explicit

    ' Specifies path to Tidy executable...
    Const TIDY_PROGRAM_FILE = "C:\Program Files\Validator\Tidy.exe"
    ' Specifies path to Tidy configuration file...
    Const TIDY_CONFIG_FILE = "C:\Program Files\Validator\tidy.cfg"
    ' Specifies path to Tidy error file...
    Const TIDY_ERROR_FILE = "C:\Program Files\Validator\tidy_errors.txt"
    ' Specifies path to Tidy temporary file...
    Const TIDY_TEMP_FILE = "C:\Program Files\Validator\tidy.tmp"

    '************************************
    ' TIDY_FILE
    '
    Sub Tidy_File()
        Dim bFlipToHTMLSource As Boolean
        bFlipToHTMLSource = False

        If ActivePageWindow Is Nothing Then
            MsgBox "Please open a file in the Frontpage Editor.", vbOKOnly Or vbCritical
            Exit Sub
        End If

        If Not ActivePageWindow.ViewMode = fpPageViewNormal Then
            bFlipToHTMLSource = True
            ActivePageWindow.ViewMode = fpPageViewNormal
        End If

        Dim doc As FPHTMLDocument
        Set doc = ActivePageWindow.Document

        Dim fs
        Set fs = CreateObject("Scripting.FileSystemObject")
        Dim ts
        Set ts = fs.CreateTextFile(TIDY_TEMP_FILE)
        ' Write the current active FrontPage document into the temporary file...
        ts.Write doc.DocumentHTML
        ts.Close

        Dim strCmd As String
        strCmd = Chr(34) & TIDY_PROGRAM_FILE & Chr(34) & _
                 " -f " & Chr(34) & TIDY_ERROR_FILE & Chr(34) & _
                 " -config " & Chr(34) & TIDY_CONFIG_FILE & Chr(34) & _
                 " " & Chr(34) & TIDY_TEMP_FILE & Chr(34)

        ' Route failures, including the error raised below, to the TidyError handler
        On Error GoTo TidyError

        ' Execute the command line
        If ExecCmd(strCmd) > 1 Then
            Err.Raise vbObjectError + 513 ' Raise a user-defined error
        End If

        ' Open the result file (the temporary file)...
        Set ts = fs.OpenTextFile(TIDY_TEMP_FILE, 1) ' 1=ForReading
        ' Load it into the active document of FrontPage
        doc.DocumentHTML = ts.ReadAll
        ts.Close
        On Error GoTo 0

        If bFlipToHTMLSource Then
            ActivePageWindow.ViewMode = fpPageViewHtml
        End If

        Dim es
        ' Read the TIDY_ERROR_FILE
        Set es = fs.OpenTextFile(TIDY_ERROR_FILE, 1) ' 1=ForReading
        ' Copy the content into the output form...
        Form_output.TextBox_output.Text = es.ReadAll
        es.Close
        Form_output.Caption = TIDY_ERROR_FILE
        Form_output.Show
        Exit Sub

    TidyError:
        MsgBox "Tidy could not execute correctly. No changes have been carried out." & Chr(10) & _
               "Error # " & CStr(Err.Number) & " " & Err.Description, vbOKOnly Or vbCritical
    End Sub
You may add an extra error-level check after the call to ExecCmd. ExecCmd() returns the error level (exit status) of the executed program. For Tidy, "0" means "OK", "1" means "There are warnings", and "2" means "There are errors". When errors occur, Tidy can't continue.
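The exit-status convention above can be turned into a small lookup function. ExecCmd itself is not listed in this article, so the sketch below also includes a stand-in based on the Windows Script Host shell object, whose Run method returns the exit code when asked to wait for the process. Both function names (DescribeTidyStatus and RunAndWait) are hypothetical and not part of the original code, and the stand-in assumes Windows Script Host is installed and enabled.

```vba
' Hypothetical helper: maps Tidy's exit status to a short description.
Function DescribeTidyStatus(ByVal nStatus As Long) As String
    Select Case nStatus
        Case 0
            DescribeTidyStatus = "OK"
        Case 1
            DescribeTidyStatus = "There are warnings"
        Case 2
            DescribeTidyStatus = "There are errors"
        Case Else
            DescribeTidyStatus = "Unexpected exit status " & CStr(nStatus)
    End Select
End Function

' Hypothetical stand-in for ExecCmd (assumption: WScript.Shell is available).
' Runs the command hidden, waits for it to finish, and returns its exit code.
Function RunAndWait(ByVal strCmd As String) As Long
    Dim sh As Object
    Set sh = CreateObject("WScript.Shell")
    ' 0 = hidden window, True = wait until the process exits
    RunAndWait = sh.Run(strCmd, 0, True)
End Function
```

A caller could then write `MsgBox DescribeTidyStatus(RunAndWait(strCmd))` to report the outcome in words rather than as a bare number.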
This section shows you how to customize an extra FrontPage menu with the call to the Tidy VBA sub procedure.
How to guide:
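As a scripted alternative to setting the menu up by hand, the standard Office CommandBars object model can add the entry from VBA. This is only a sketch under assumptions: the procedure name AddTidyMenu and the captions are hypothetical, it assumes the macro is named Tidy_File as in the listing above, and it assumes your FrontPage version honours the OnAction property for macros. The control is created as temporary, so it disappears when FrontPage is closed.

```vba
' Hypothetical sketch: adds a "Tidy" menu with one entry to the active menu bar.
Sub AddTidyMenu()
    Dim oMenu As Object   ' CommandBarPopup (late bound)
    Dim oItem As Object   ' CommandBarButton (late bound)

    ' 10 = msoControlPopup; Temporary:=True removes the menu when FrontPage exits
    Set oMenu = Application.CommandBars.ActiveMenuBar.Controls.Add( _
                    Type:=10, Temporary:=True)
    oMenu.Caption = "Ti&dy"

    ' 1 = msoControlButton
    Set oItem = oMenu.Controls.Add(Type:=1)
    oItem.Caption = "&Tidy current page"
    ' Assumption: the macro to run is the Tidy_File sub from tidy.bas
    oItem.OnAction = "Tidy_File"
End Sub
```

Running AddTidyMenu once per session from the VBA editor should be enough to try it out; a permanent menu is better configured through the FrontPage customization dialog.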
I have shown you how you can integrate HTML Tidy into FrontPage and thereby improve the overall quality of your web documents in an easy manner.
The next obvious thing to implement would be an offline validator that can be executed from within FrontPage. That would really be something that would increase the quality of the web documents. I suggest reading my accompanying article "Integrating Validation into Microsoft FrontPage" which explains how you can add offline validation to your web page. It's as good as the validator available from http://validator.w3.org/.
Within FrontPage 2000, a document containing the !DOCTYPE specification (typically the first line) may not be preserved after an execution of the Tidy macro.
This problem does not occur in FrontPage 2002.
Cause: Tidy is run on the ActiveDocument.DocumentHTML string, which does not contain the DOCTYPE specification.
Resolution: Upgrade to FrontPage 2002.
Comments: The ActiveDocument.DocumentHTML string may not contain all code from the web page, which may lead to other problems. One example is the text of the shared borders, which may contain HTML code. This HTML code will not be repaired by Tidy.
Recent HTML Tidy builds create an XML declaration when 'output-xhtml' is set to 'yes'. This is, in fact, good. Older builds omitted the XML declaration, even when encodings other than UTF-8 or UTF-16 were used. I'm afraid this could cause big problems. Since FrontPage 2000 doesn't like XML declarations, it moves them into the body element.
In case of long file names, e.g. if you have installed HTML Tidy in a folder with long file names (e.g. under C:\Program Files), the VB code should be changed to quote the file names.
In tidy.bas the following should be changed from:

    Dim strCmd As String
    strCmd = TIDY_PROGRAM_FILE & " -f " & TIDY_ERROR_FILE & _
             " -config " & TIDY_CONFIG_FILE & " " & TIDY_TEMP_FILE

to:

    Dim strQuote As String
    Dim strCmd As String
    strQuote = Chr$(34)
    strCmd = strQuote & TIDY_PROGRAM_FILE & strQuote & _
             " -f " & strQuote & TIDY_ERROR_FILE & strQuote & _
             " -config " & strQuote & TIDY_CONFIG_FILE & strQuote & _
             " " & strQuote & TIDY_TEMP_FILE & strQuote
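The repeated quoting can also be factored into a helper so that no argument is missed. This is a minimal sketch; the names QuoteArg and BuildTidyCommand are hypothetical and do not appear in the original tidy.bas.

```vba
' Hypothetical helper: wraps a path in double quotes so that
' spaces (as in "C:\Program Files") do not split the argument.
Function QuoteArg(ByVal sPath As String) As String
    QuoteArg = Chr$(34) & sPath & Chr$(34)
End Function

' Hypothetical helper: builds the same Tidy command line as above,
' with every file name quoted.
Function BuildTidyCommand(ByVal sProgramFile As String, _
                          ByVal sErrorFile As String, _
                          ByVal sConfigFile As String, _
                          ByVal sInputFile As String) As String
    BuildTidyCommand = QuoteArg(sProgramFile) & _
                       " -f " & QuoteArg(sErrorFile) & _
                       " -config " & QuoteArg(sConfigFile) & _
                       " " & QuoteArg(sInputFile)
End Function
```

With these in place, the assignment becomes `strCmd = BuildTidyCommand(TIDY_PROGRAM_FILE, TIDY_ERROR_FILE, TIDY_CONFIG_FILE, TIDY_TEMP_FILE)`.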
This appendix contains a preliminary description of how to use the HTML Tidy ATL/COM Wrapper DLL as implemented by Charles Reitzel (see http://users.rcn.com/creitzel/tidy.html#comatl). One of the advantages of using it is that you can potentially avoid all the temporary files.
I still need to figure out how to provide better message feedback. I would prefer a solution where all messages are written to a modeless window (dialog) visible to the FrontPage user, in which the user could click on an (error) line to jump directly to the position in the source code where the problem is located.
    Option Explicit

    ' Specifies path to Tidy configuration file...
    Const TIDY_CONFIG_FILE = "C:\Program Files\Validator\tidy.cfg"

    ' *************************************************
    ' DoTidy
    ' Tidies the currently active page in FrontPage
    ' according to the HTML Tidy configuration
    ' specified in the configuration file defined
    ' in the TIDY_CONFIG_FILE constant.
    '
    ' Note: Tidy (error) messages are captured in
    ' the OnMessage event handler in the
    ' project's TidyDocEventClass.
    '
    Sub DoTidy()
        Dim bFlipToHTMLSource As Boolean
        bFlipToHTMLSource = False

        If ActivePageWindow Is Nothing Then
            MsgBox "Please open a file in the Frontpage Editor.", vbOKOnly Or vbCritical
            Exit Sub
        End If

        If Not ActivePageWindow.ViewMode = fpPageViewNormal Then
            bFlipToHTMLSource = True
            ActivePageWindow.ViewMode = fpPageViewNormal
        End If

        Dim oFPdoc As FPHTMLDocument ' The FrontPage document (e.g. in HTML)
        Set oFPdoc = ActivePageWindow.Document

        Dim oTidyDoc As TidyDocument
        Set oTidyDoc = New TidyDocument

        ' Setup class to handle events (OnMessage) from the Tidy document
        Dim oEvtClass As TidyDocEventClass
        Set oEvtClass = New TidyDocEventClass
        Set oEvtClass.TidyDoc = oTidyDoc

        Dim nStat As Long
        On Error GoTo TidyError
        nStat = 0

        'If Len(sErrorFile) > 0 Then
        '    nStat = oTidyDoc.SetErrorFile(sErrorFile)
        'End If

        ' Each step runs only if the previous one did not fail (status >= 0)
        If nStat >= 0 Then
            nStat = oTidyDoc.LoadConfig(TIDY_CONFIG_FILE)
        End If
        If nStat >= 0 Then
            nStat = oTidyDoc.ParseString(oFPdoc.DocumentHTML)
        End If
        If nStat >= 0 Then
            nStat = oTidyDoc.CleanAndRepair()
        End If
        If nStat >= 0 Then
            nStat = oTidyDoc.RunDiagnostics()
        End If
        If nStat >= 0 Then
            oFPdoc.DocumentHTML = oTidyDoc.SaveString()
        End If

        If bFlipToHTMLSource Then
            ActivePageWindow.ViewMode = fpPageViewHtml
        End If
        Exit Sub

    TidyError:
        MsgBox "Tidy could not execute correctly. No changes have been carried out." & _
               Chr(10) & "Error # " & CStr(Err.Number) & " " & Err.Description, vbOKOnly Or vbCritical
    End Sub
    ' TidyDocEventClass (class module)
    Option Explicit

    Public WithEvents TidyDoc As TidyDocument

    ' *************************************************
    ' TidyDoc_OnMessage
    ' Event handler (OnMessage) for TidyDocument.
    '
    Private Sub TidyDoc_OnMessage(ByVal level As TidyReportLevel, ByVal nLine As Long, _
                                  ByVal nCol As Long, ByVal sMsg As String)
        Dim sLevel As String, sLine As String

        ' Translate the report level into a readable prefix
        Select Case level
            Case TidyInfo:        sLevel = "Info: "
            Case TidyAccess:      sLevel = "Access: "
            Case TidyWarning:     sLevel = "Warning: "
            Case TidyConfig:      sLevel = "Config: "
            Case TidyError:       sLevel = "Error: "
            Case TidyBadDocument: sLevel = "Doc: "
            Case TidyFatal:       sLevel = "Fatal: "
            Case Else:            sLevel = "???: "
        End Select

        ' Include the position only when Tidy reports one
        If nLine > 0 Then
            sLine = sLevel & "Line " & nLine & " Col " & nCol & ", " & sMsg
        Else
            sLine = sLevel & sMsg
        End If

        MsgBox sLine
    End Sub
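The handler above shows one message box per Tidy message, which becomes tedious for pages with many warnings. One possible step towards the better feedback wished for earlier is to collect the messages and show them once. The variant below is only a sketch: the m_sMessages field and the Messages property are hypothetical additions, and it assumes the same TidyDocument event signature as the class above.

```vba
' Hypothetical variant of TidyDocEventClass (class module) that
' collects Tidy messages instead of showing a box for each one.
Option Explicit

Public WithEvents TidyDoc As TidyDocument

' Accumulated messages, one per line
Private m_sMessages As String

Private Sub TidyDoc_OnMessage(ByVal level As TidyReportLevel, ByVal nLine As Long, _
                              ByVal nCol As Long, ByVal sMsg As String)
    Dim sLine As String

    ' Include the position only when Tidy reports one
    If nLine > 0 Then
        sLine = "Line " & nLine & " Col " & nCol & ", " & sMsg
    Else
        sLine = sMsg
    End If

    m_sMessages = m_sMessages & sLine & vbCrLf
End Sub

' Read-only access to everything collected so far
Public Property Get Messages() As String
    Messages = m_sMessages
End Property
```

After the RunDiagnostics step in DoTidy, the collected text could be placed in the output form used by tidy.bas, for example `Form_output.TextBox_output.Text = oEvtClass.Messages` followed by `Form_output.Show vbModeless`, assuming that form is also present in the project.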