I'm Michael Suodenjoki - a software engineer living in Kgs. Lyngby, north of Copenhagen, Denmark. This is my personal site containing my blog, photos, articles and main interests.


Integrating HTML Tidy into Microsoft FrontPage

Improving standards compliance with FrontPage

Updated 2011.01.23 17:22 +0100


By Michael Suodenjoki, michael@suodenjoki.dk.

Version 1.8 May 2003.

Abstract

This article describes how you can improve the quality of your HTML/XHTML pages by integrating HTML Tidy into Microsoft FrontPage 2000, 2002 (Office XP) or 2003 (Beta 2 will do).

Related Article: Integrating Validation into Microsoft FrontPage.

In this new version 1.8 I've added appendix B, containing a preliminary explanation of how to use TidyATL.dll.

Feedback

Thanks for the help to integrate HTML tidy into FrontPage.
Barak Naveh, 2003.12.24
Integrated HTML Tidy (“HT”) into FP 2003 beta 2 this morning and I must first thank you (and your colleagues) for providing such a useful – and free! – utility.
Brian Smith, 2003.11.06

Requirements

The code in this article requires Microsoft FrontPage 2000, 2002 (Office XP) or 2003, and HTML Tidy (see section 1.1).

Download Source

Download the tidy_fp.zip source file for this article.

The tidy_fp.zip file contains the files imported in section 3.2, including ExecuteCmd.bas, Tidy.bas and Form_tidy_output.frm.

Contents

1 Introduction
    1.1 HTML Tidy
2 Let's tweak the cleaning
3 Integration with FrontPage
    3.1 Customizing the VBA code
    3.2 Customizing the FrontPage menu
4 Conclusion

Appendix A: Known problems
Appendix B: Using TidyATL.dll

1 Introduction

How often have you written web documents in editors or word processors that simply couldn't produce the underlying markup correctly? You may not be aware of it, but most of today's HTML editors are not very good at producing valid HTML. As an author of web documents, you have an interest in writing them so that most of your readers can actually read them in whichever browser they use.

Most of the pages on my personal homepage are written as XHTML documents. This is the emerging standard for web documents (see www.w3.org). It's an XML-based version of the HTML standard, with some important differences, such as requiring that every element is closed, that element and attribute names are written in lowercase, and that attribute values are quoted.

For me it is most important that the code is "pretty", commented, and valid with respect to the relevant standards. This should be true whether you have written the code by hand in a regular text editor (like Notepad), or generated it via a WYSIWYG editor (like FrontPage).

This article describes how you can improve the web documents written with the Microsoft FrontPage editor. I will mainly focus on the XHTML part. Microsoft FrontPage is just one of many editors in which you can create web documents or manage/edit entire webs (collections of web documents). FrontPage is a fairly decent editor that produces good quality XHTML code; however, it's not perfect.

1.1 HTML Tidy

HTML Tidy is a tool that can clean up the underlying code (tags) of your web document. Tidy is an open source project originally written by Dave Raggett. It's basically a command line tool that takes an HTML file and generates a new HTML file with cleaner code. The new file is generated according to a large set of rules and layout preferences that you specify either on the command line or in a configuration file (the preferred method).

You can download HTML Tidy from http://tidy.sourceforge.net/.

2 Let's tweak the cleaning

When you have downloaded HTML Tidy you should take your time to familiarize yourself with it. Try giving it some HTML files as arguments and see what happens. For example, the following command line passes HTML Tidy the -indent option and your file as argument:

C:\<path to HTML Tidy>\tidy.exe -indent <your file>.htm

You probably already have some preferences for how the layout should look, and you may want to try tweaking the many different options via a configuration file.

At the HTML Tidy project page, a quick reference of the options can be found.

The configuration file is a simple text based file that you can write in e.g. Notepad. My configuration file (tidy.cfg) looks like the following:

// Config file for HTML tidy
output-xhtml: yes 
doctype: strict
char-encoding: raw 
tidy-mark: no

markup: yes 
indent: auto
indent-spaces: 2
tab-size: 2
wrap: 120

break-before-br: no
drop-empty-paras: yes

word-2000: yes
clean: no

write-back: yes 
keep-time: yes

3 Integration with FrontPage

Microsoft FrontPage, version 2000 or newer, supports Visual Basic® for Applications that we can utilize for integrating HTML Tidy into the menu system. A simple menu activation will clean/tidy your web document.

First download the zip file with the source code files and unzip it to a folder e.g. to C:\Program Files\Validator.

I have made two Visual Basic files available that wrap the call to HTML Tidy within VBA. The code is based on code originally written by Christoph Schneegans. See http://www.schneegans.de/frontpage-vba/tidy.html for details (in German though).

To incorporate these into FrontPage VBA see section 3.2 below.

3.1 Customizing the VBA code

The VBA code cannot be used directly but must be customized to the specified location of HTML Tidy. Four string constants should be defined:

TIDY_PROGRAM_FILE: Should specify the full path and file name of the HTML Tidy executable, e.g. C:\Program Files\Validator\Tidy.exe.
TIDY_CONFIG_FILE: Should specify the full path and file name of the HTML Tidy configuration file.
TIDY_ERROR_FILE: Should specify the full path and file name of the HTML Tidy error log file.
TIDY_TEMP_FILE: Should specify the full path and file name of a temporary file.

The Tidy.bas module looks like the following:

Attribute VB_Name = "Tidy"
'
' Tidy.bas - Integration with Tidy in FrontPage 2000 or newer
'
' Based on code by Christoph Schneegans <mailto:Christoph@Schneegans.de>
' See <http://www.schneegans.de/frontpage-vba/tidy.html>
'

Option Explicit

' Specifies Path to Tidy executable...
Const TIDY_PROGRAM_FILE = "C:\Program Files\Validator\Tidy.exe"

' Specifies path to Tidy configuration file...
Const TIDY_CONFIG_FILE = "C:\Program Files\Validator\tidy.cfg"

' Specifies path to Tidy error files....
Const TIDY_ERROR_FILE = "C:\Program Files\Validator\tidy_errors.txt"

' Specifies path to Tidy temporary file...
Const TIDY_TEMP_FILE = "C:\Program Files\Validator\tidy.tmp"

'************************************
' TIDY_FILE
'
'
Sub Tidy_File()

  Dim bFlipToHTMLSource As Boolean
  bFlipToHTMLSource = False

  If ActivePageWindow Is Nothing Then
    MsgBox "Please open a file in the FrontPage editor.", vbOKOnly Or vbCritical
    Exit Sub
  End If

  If Not ActivePageWindow.ViewMode = fpPageViewNormal Then
    bFlipToHTMLSource = True
    ActivePageWindow.ViewMode = fpPageViewNormal
  End If

  Dim doc As FPHTMLDocument
  Set doc = ActivePageWindow.Document

  Dim fs
  Set fs = CreateObject("Scripting.FileSystemObject")
  Dim ts
  Set ts = fs.CreateTextFile(TIDY_TEMP_FILE)

  ' Write the current active FrontPage document into the temporary file...
  ts.Write doc.DocumentHTML
  ts.Close

  Dim strCmd As String
  strCmd = Chr(34) & TIDY_PROGRAM_FILE & Chr(34) & _
           " -f " & Chr(34) & TIDY_ERROR_FILE & Chr(34) & _
           " -config " & Chr(34) & TIDY_CONFIG_FILE & Chr(34) & _
           " " & Chr(34) & TIDY_TEMP_FILE & Chr(34)

  ' Execute the command line
  If ExecCmd(strCmd) > 1 Then
    Err.Raise vbObjectError + 513 ' Raise a user-defined error
    Exit Sub
  End If

  ' Open the result file (the temporary file)...
  Set ts = fs.OpenTextFile(TIDY_TEMP_FILE, 1) ' 1=ForReading

  ' Load it into the active document of FrontPage
  On Error GoTo TidyError
  doc.DocumentHTML = ts.ReadAll
  On Error GoTo 0

  If bFlipToHTMLSource Then
    ActivePageWindow.ViewMode = fpPageViewHtml
  End If

  Dim es
  ' Read the TIDY_ERROR_FILE 
  Set es = fs.OpenTextFile(TIDY_ERROR_FILE, 1) ' 1=ForReading

  ' Copy the content into the output form...
  Form_output.TextBox_output.Text = es.ReadAll
  Form_output.Caption = TIDY_ERROR_FILE
  Form_output.Show

  Exit Sub

TidyError:
  MsgBox "Tidy could not execute correctly. No changes have been carried out." & Chr(10) & _
         "Error # " & CStr(Err.Number) & " " & Err.Description, vbOKOnly Or vbCritical

End Sub

You may add an extra error level check after ExecCmd has been executed. ExecCmd() returns the error level (exit code) of the executed program. For Tidy, "0" means "OK", "1" means "There are warnings" and "2" means "There are errors". When errors occur, Tidy can't continue.
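
Such a check could look roughly like the sketch below. TidyExitCodeText is a hypothetical helper that is not part of the original Tidy.bas, and the sketch assumes that ExecCmd returns the exit code as a Long.

```vba
' Hypothetical helper (not part of the original Tidy.bas): translates the
' exit code returned by ExecCmd into the meanings listed above.
Function TidyExitCodeText(ByVal nExitCode As Long) As String
  Select Case nExitCode
    Case 0
      TidyExitCodeText = "OK"
    Case 1
      TidyExitCodeText = "There are warnings"
    Case 2
      TidyExitCodeText = "There are errors"
    Case Else
      TidyExitCodeText = "Unexpected exit code " & CStr(nExitCode)
  End Select
End Function

' Possible use inside Tidy_File, replacing the Err.Raise block
' (assumption: ExecCmd returns a Long):
'
'   Dim nExit As Long
'   nExit = ExecCmd(strCmd)
'   If nExit > 1 Then
'     MsgBox "Tidy: " & TidyExitCodeText(nExit), vbOKOnly Or vbCritical
'     Exit Sub
'   End If
```

Showing a message box instead of raising an error keeps the failure readable for the FrontPage user; whether that is preferable to Err.Raise is a matter of taste.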

3.2 Customizing the FrontPage menu

This section shows you how to add an extra FrontPage menu that calls the Tidy VBA sub procedure.

How to guide:

  1. Open FrontPage.
  2. Activate the 'Tools|Macro|Visual Basic Editor' menu item. This should open up the VBA editor of FrontPage.
  3. Right click the Modules folder and select 'Import File...' and import the ExecuteCmd.bas file. Activate import file again and select the Tidy.bas file. See figure below. You should now have at least two modules in the Modules folder - one named ExecuteCmd and one named Tidy.
  4. If there is an empty Module1 module you can safely delete it.
  5. Import the form file Form_tidy_output.frm which defines the dialog that will be used to display the (output) result from tidy.
  6. Go into the Tidy module and define the 4 constants as described earlier in section 3.1.

    Figure illustrating the import file feature of VBA.

  7. Close the VBA editor by activating 'File|Close' or press Alt+Q. You're now back in FrontPage.
  8. Select 'Tools|Customize...' to open up the Customize dialog.
  9. Select the Commands tab in the Customize dialog (see figure below).

    The Tools|Customize dialog.

  10. Select New Menu in the categories list box. The right hand side list box should contain at least one command available named "New Menu".
  11. Click and drag the "New Menu" command up to the main menu of FrontPage at a preferred location e.g. after the Format menu item. The location where you insert should be marked by a vertical insertion bar.
  12. Right click at your newly inserted menu item to open a special customize context menu.
  13. The context menu contains a menu item called Name where you can specify a name for your menu item. I've used the name "E&xtras", where the ampersand indicates which letter acts as the shortcut and therefore will be underlined.
  14. In the Customize dialog select the Macro item from the categories list box. Two new commands should now be available at the right hand side. Select the "Custom Menu Item" and drag it to your new Extras menu.
  15. Again select the context menu by right clicking and give the new menu item a name e.g. "&Tidy Document".
  16. Within the opened context menu, use the "Assign macro" item to specify which macro to execute when the menu item is activated by the end-user. Select the Tidy macro (the Tidy_File sub procedure from the Tidy module).
  17. Close the Customize dialog.
  18. You should now be up and running. Test that everything works.

4 Conclusion

I have shown you how you can integrate HTML Tidy into FrontPage and thereby improve the overall quality of your web documents in an easy manner.

The next obvious thing to implement would be an offline validator that can be executed from within FrontPage. That would really be something that would increase the quality of the web documents. I suggest reading my accompanying article "Integrating Validation into Microsoft FrontPage" which explains how you can add offline validation to your web page. It's as good as the validator available from http://validator.w3.org/.

Nice coding.

Appendix A: Known problems

A.1 DOCTYPE problem in FrontPage 2000

Within FrontPage 2000, a document containing the !DOCTYPE specification (typically the first line) may not be preserved after an execution of the Tidy macro.

This problem does not occur in FrontPage 2002.

Cause: Tidy is run on the ActiveDocument.DocumentHTML string which does not contain the DOCTYPE specification.

Resolution: Change to FrontPage 2002.

Comments: The ActiveDocument.DocumentHTML string may not contain all code from the web page, which may lead to other problems. One example is the text of the shared borders, which may contain HTML code. This HTML code will not be repaired by Tidy.
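
Since the cause is a missing DOCTYPE in the string handed to Tidy, the macro could at least warn before the page is overwritten. A minimal sketch, assuming a hypothetical helper named HasDoctype that is not part of the original Tidy.bas:

```vba
' Hypothetical helper (not in the original Tidy.bas): returns True when the
' given HTML text contains a DOCTYPE declaration (case-insensitive search).
Function HasDoctype(ByVal sHtml As String) As Boolean
  HasDoctype = (InStr(1, sHtml, "<!DOCTYPE", vbTextCompare) > 0)
End Function

' Possible use inside Tidy_File, before the temporary file is written:
'
'   If Not HasDoctype(doc.DocumentHTML) Then
'     MsgBox "No DOCTYPE found in DocumentHTML; it may be lost.", vbOKOnly Or vbExclamation
'   End If
```

This only detects the situation; it does not restore the DOCTYPE.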

A.2 Possible XML declaration problem in FrontPage 2000

Recent HTML Tidy builds create an XML declaration when 'output-xhtml' is set to 'yes'. This is, in fact, good: older builds omitted the XML declaration, even when encodings other than UTF-8 or UTF-16 were used. However, I'm afraid this could cause big problems in FrontPage 2000, which doesn't like XML declarations and moves them into the body element.

A.3 Long file name handling

In case of long file names (names containing spaces), e.g. if you have installed HTML Tidy in a folder under C:\Program Files, the VB code must quote the file names on the command line. The Tidy.bas listing in section 3.1 already does this.

If your copy of Tidy.bas builds the command line without quotes, modify it from:

Dim strCmd As String
strCmd = TIDY_PROGRAM_FILE & " -f " & TIDY_ERROR_FILE & _
" -config " & TIDY_CONFIG_FILE & " " & TIDY_TEMP_FILE

To:

Dim strQuote As String
Dim strCmd As String
strQuote = Chr$(34)
strCmd = strQuote & TIDY_PROGRAM_FILE & strQuote & " -f " & strQuote & TIDY_ERROR_FILE & strQuote & _
" -config " & strQuote & TIDY_CONFIG_FILE & strQuote & " " & strQuote & TIDY_TEMP_FILE & strQuote

Appendix B: Using TidyATL.dll

This appendix contains a preliminary description of how to use the HTML Tidy ATL/COM wrapper DLL as implemented by Charles Reitzel (see http://users.rcn.com/creitzel/tidy.html#comatl). One advantage of using it is that you can potentially avoid all the temporary files.

Preliminary How To

  1. Download the TidyATL.zip, unzip it and save the TidyATL.dll somewhere e.g. in "C:\Program Files\Validator".
  2. Register the DLL by running "regsvr32 TidyATL.dll" from a command prompt.
  3. Start FrontPage and open the VBA editor by activating 'Tools|Macro|Visual Basic Editor'.
  4. In the VBA editor open the References dialog by activating 'Tools|References'.
  5. In the References dialog, tick the checkbox next to "Tidy 1.0 Type Library" in the list of available type libraries. This informs VBA that you would like to use the TidyDocument object in your VBA project.
  6. Create a new module (name it Tidy or, if that name is already taken, give it a unique name of your choice).
  7. Enter the code as seen in the 'VB Module Code - Tidy' section below into the newly created module.
  8. Create a new class module and name it TidyDocEventClass (I guess you do not have this already?).
  9. Enter the code as seen in the 'VB Class Module Code - Tidy Doc Event Class' section below into the newly created class module.
  10. Now you can create a menu item (as previously explained in section 3.2 above) containing a macro call to the DoTidy procedure.

I still need to figure out how to provide better message feedback. I would prefer a solution where all messages are written to a modeless window (dialog) visible to the FrontPage user, in which the user could click on an (error) line and jump directly to the position in the source code where the problem was located.

VB Module Code - Tidy

Option Explicit

' Specifies path to Tidy configuration file...
Const TIDY_CONFIG_FILE = "C:\Program Files\Validator\tidy.cfg"

' *************************************************
' DoTidy
'   Tidies the currently active page in FrontPage
'   according to the HTML Tidy configuration
'   specified in the configuration file defined
'   in the TIDY_CONFIG_FILE constant.
'
' Note: Tidy (error) messages are captured in
'       the OnMessage event handler in the
'       project's TidyDocEventClass.
'
Sub DoTidy()

  Dim bFlipToHTMLSource As Boolean
  bFlipToHTMLSource = False

  If ActivePageWindow Is Nothing Then
    MsgBox "Please open a file in the FrontPage editor.", vbOKOnly Or vbCritical
    Exit Sub
  End If

  If Not ActivePageWindow.ViewMode = fpPageViewNormal Then
    bFlipToHTMLSource = True
    ActivePageWindow.ViewMode = fpPageViewNormal
  End If

  Dim oFPdoc As FPHTMLDocument ' The FrontPage document (e.g. in HTML)
  Set oFPdoc = ActivePageWindow.Document

  Dim oTidyDoc As TidyDocument
  Set oTidyDoc = New TidyDocument

  ' Setup class to handle events (OnMessage) from the Tidy document
  Dim oEvtClass As TidyDocEventClass
  Set oEvtClass = New TidyDocEventClass
  Set oEvtClass.TidyDoc = oTidyDoc

  Dim nStat As Long

  On Error GoTo TidyError
  nStat = 0
  'If Len(sErrorFile) > 0 Then
  ' nStat = oTidyDoc.SetErrorFile(sErrorFile)
  'End If
  If nStat >= 0 Then
    nStat = oTidyDoc.LoadConfig(TIDY_CONFIG_FILE)
  End If
  If nStat >= 0 Then
    nStat = oTidyDoc.ParseString(oFPdoc.DocumentHTML)
  End If
  If nStat >= 0 Then
    nStat = oTidyDoc.CleanAndRepair()
  End If
  If nStat >= 0 Then
    nStat = oTidyDoc.RunDiagnostics()
  End If
  If nStat >= 0 Then
    oFPdoc.DocumentHTML = oTidyDoc.SaveString()
  End If

  If bFlipToHTMLSource Then
    ActivePageWindow.ViewMode = fpPageViewHtml
  End If

  Exit Sub

TidyError:
  MsgBox "Tidy could not execute correctly. No changes have been carried out." & _
          Chr(10) & "Error # " & CStr(Err.Number) & " " & Err.Description, vbOKOnly Or vbCritical
End Sub

VB Class Module Code - Tidy Doc Event Class

Option Explicit

Public WithEvents TidyDoc As TidyDocument

' *************************************************
' TidyDoc_OnMessage
'   Event handler (OnMessage) for TidyDocument.
'
Private Sub TidyDoc_OnMessage(ByVal level As TidyReportLevel, ByVal nLine As Long, _
  ByVal nCol As Long, ByVal sMsg As String)

  Dim sLevel As String, sLine As String

  If level = TidyInfo Then
    sLevel = "Info: "
  ElseIf level = TidyAccess Then
    sLevel = "Access: "
  ElseIf level = TidyWarning Then
    sLevel = "Warning: "
  ElseIf level = TidyConfig Then
    sLevel = "Config: "
  ElseIf level = TidyError Then
    sLevel = "Error: "
  ElseIf level = TidyBadDocument Then
    sLevel = "Doc: "
  ElseIf level = TidyFatal Then
    sLevel = "Fatal: "
  Else
    sLevel = "???: "
  End If

  If nLine > 0 Then
    sLine = sLevel & "Line " & nLine & ", Col " & nCol & ", " & sMsg
  Else
    sLine = sLevel & sMsg
  End If

  MsgBox sLine
End Sub
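
As a small step toward the better message feedback wished for earlier in this appendix, the event class could collect the messages instead of showing one message box per message. The following is only a sketch: the m_sMessages buffer and the Messages property are hypothetical additions, and the usage described afterwards assumes that the Form_output form from section 3.2 has been imported.

```vba
Option Explicit

Public WithEvents TidyDoc As TidyDocument

' Hypothetical buffer: one line of text per message received from Tidy.
Private m_sMessages As String

' Event handler (OnMessage): append the message instead of calling MsgBox.
Private Sub TidyDoc_OnMessage(ByVal level As TidyReportLevel, ByVal nLine As Long, _
  ByVal nCol As Long, ByVal sMsg As String)

  If nLine > 0 Then
    m_sMessages = m_sMessages & "Line " & nLine & ", Col " & nCol & ": " & sMsg & vbCrLf
  Else
    m_sMessages = m_sMessages & sMsg & vbCrLf
  End If
End Sub

' Hypothetical read-only property exposing everything collected so far.
Public Property Get Messages() As String
  Messages = m_sMessages
End Property
```

DoTidy could then, after the call to RunDiagnostics, assign oEvtClass.Messages to Form_output.TextBox_output.Text and call Form_output.Show, in the same way Tidy_File does in section 3.1.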
