January 10, 2009

Cross-referencing multiple documents with keys

Filed under: Uncategorized — John Walkup @ 9:24 pm

Probably the most frustrating search this year has been my quest to find a good discussion on using keys in multiple source documents. Let me explain.

Keys are used to “tag” certain elements so that they can be referenced later. In a way, the <xsl:key> command functions like the \label{ } command in LaTeX. To call on a key (in LaTeX, that would be similar to the \ref{ } command), one uses key( ).

\section{Introduction}
\label{intro}

\section{Theory}

This is a reference to Sec.\ \ref{intro}.

So in essence:

<xsl:key> -> \label{}
key() -> \ref{}

Note that in LaTeX, the arguments of both \label{} and \ref{} are the same if they reference the same item. So if a section of text is labeled, say, “intro” then the label declaration would be \label{intro} and the call to the section would be \ref{intro}. LaTeX generates an automatic number and substitutes it for the \ref{} call when it compiles the code.

The way the <xsl:key> command treats its arguments differs. The first glaring difference is that, unlike LaTeX, the XML source does not contain the key definitions; they live in the XSL style sheet. In LaTeX, that would be like passing \label{} and \ref{} as arguments of the compiler call, telling the compiler which parts of the document each \label{} marks and where each \ref{} is called. Just defining the \label{} command alone would require something like the following in LaTeX: $latex -label{intro} at section{1}, subsection{2} (This won’t work, so don’t try it.)

Let’s look at a sample XSL document that marks up two XML source documents:

<xsl:variable name="StatesLookup" select="document('key_state.xml')"/>
<xsl:key name="StateKey" match="state" use="@sid"/>
<xsl:template match="customer">
  <xsl:variable name="custstate" select="@stateid"/>
  <xsl:variable name="custname" select="."/>
  <xsl:for-each select="$StatesLookup">
    <xsl:value-of select="$custname"/> lives in <xsl:value-of select="key('StateKey', $custstate)"/>
  </xsl:for-each>
</xsl:template>

Here is the first (primary) XML source document:

<customers>
  <customer stateid="03">Bo</customer>
  <customer stateid="01">Ty</customer>
  <customer stateid="02">Jo</customer>
</customers>

Here is the second XML document:

<states region="Midwest">
  <state sid="01">Iowa</state>
  <state sid="02">Utah</state>
  <state sid="03">Ohio</state>
</states>

Line 1 of the XSL file simply defines a variable to represent a call to the secondary document key_state.xml. So whenever we see a call to $StatesLookup we will be opening key_state.xml and peering in.

Line 2 defines the key we plan to use when marking up the two XML documents. The declaration does not name a document, but the only <state> elements live in the secondary document key_state.xml, so at this point it functions like a \label{ } command placed inside a LaTeX section. The key indexes each <state> element by the value of its sid attribute. In a way, defining the key is like creating a physical key to open all drawers of the file cabinet marked “state.” Each drawer is labeled individually “01,” “02,” and “03.”

Now that we have a key that will retrieve any of the <state> elements, something in the first (primary) XML document has to call on this key. We can’t insert the key( ) call within an XML document; key( ) calls are only found in XSL documents. Therefore, the key( ) call has to pick up a value from the primary XML document, which is why key( ) takes two arguments: the first tells us which key we are using, the second supplies the value to look up. In line 7 of the XSL document, the call

key('StateKey', $custstate)

tells the style sheet to choose the key called StateKey and return the <state> element whose sid attribute equals the value of $custstate, which is defined in Line 4. Since the stateid attribute has a particular value for each <customer> element, each customer is matched to the appropriate element in the secondary XML document.

And it won’t just do this for the first element: the customer template fires once for every <customer> element in the <customers> node. The <xsl:for-each> command plays a different role here. $StatesLookup holds a single node (the root of key_state.xml), so the loop body runs only once per customer. Its job is to shift the target of the operations to the secondary document, because in XSLT 1.0 key( ) searches only the document that contains the current node.

Tying it all together

So what does all this mean?

The key( ) command accepts two arguments: the first identifies the key and the second supplies a value. That value is compared against whatever the use attribute (the third attribute) of the <xsl:key> declaration selects, and key( ) returns the elements for which the two match.
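
To check my understanding, here is a minimal sketch of the whole thing as a complete style sheet. It assumes an XSLT 1.0 processor and that key_state.xml sits in the same folder as the style sheet; the text output method, the customers template, and the line breaks are my own additions, not part of the snippet above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Secondary document that holds the state names -->
  <xsl:variable name="StatesLookup" select="document('key_state.xml')"/>

  <!-- Index every <state> element by the value of its sid attribute -->
  <xsl:key name="StateKey" match="state" use="@sid"/>

  <!-- Added so that whitespace between <customer> elements is not copied out -->
  <xsl:template match="customers">
    <xsl:apply-templates select="customer"/>
  </xsl:template>

  <xsl:template match="customer">
    <xsl:variable name="custstate" select="@stateid"/>
    <xsl:variable name="custname" select="."/>
    <!-- Visits one node (the root of key_state.xml) purely to switch context -->
    <xsl:for-each select="$StatesLookup">
      <xsl:value-of select="$custname"/>
      <xsl:text> lives in </xsl:text>
      <xsl:value-of select="key('StateKey', $custstate)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the customers document above, this should print one line per customer, starting with “Bo lives in Ohio.”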

Next, I will need to learn how to generate automatic indexes.

Keys in XSL

Filed under: Uncategorized — John Walkup @ 9:23 pm

My search for a utility that will allow me to reference one part of a document in another has zeroed in on the <xsl:key> command. Essentially, if for example I have a bibliography entry in one part of a document, and I need to reference that entry in another part, I need a utility that will permit it.

The easiest implementation I have found so far is a simplified version of the example found at http://nwalsh.com/docs/tutorials/xsl/xsl/foil68.html. It uses the following XML and XSL examples:

<?xml version="1.0" encoding="UTF-8"?>

<doc>
  <para>See <bibref>xyzzy</bibref>.
  </para>
  <biblio>
    <bib abbrev="xyzzy">The Great Grue</bib>
    <bib abbrev="abcde">That Alphabet Song</bib>
  </biblio>
</doc>


<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:key name="bibkey" match="//biblio/bib" use="@abbrev"/>

  <xsl:template match="doc">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="biblio">
    <!-- suppressed -->
  </xsl:template>

  <xsl:template match="para">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="bibref">
    <xsl:apply-templates select="key('bibkey',string(.))"/>
  </xsl:template>

</xsl:stylesheet>
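
As written, the style sheet has no rule for <bib>, so when the bibref template hands the looked-up entry to <xsl:apply-templates>, the built-in rules simply copy out its text. A minimal sketch of a dedicated rule that could be dropped into the style sheet follows; the square brackets are my own choice for illustration, not part of the original example.

```xml
<!-- Hypothetical rule: set the looked-up bibliography entry off in brackets -->
<xsl:template match="bib">
  <xsl:text>[</xsl:text>
  <xsl:value-of select="."/>
  <xsl:text>]</xsl:text>
</xsl:template>
```

Because the biblio template is suppressed, this rule should only fire for entries reached through key( ), so the reference in the paragraph would come out as “See [The Great Grue].”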

Static and flow

Filed under: Uncategorized — John Walkup @ 9:22 pm

Okay, I am getting a little perturbed at XSL book authors, and I will tell you when I get to my Daily Gripes section. For now, let me just say that this latest gripe centers on the use of static and flow inside a document.

Here is the code I am working with right now, which does not work.

<xsl:template match="/">
  <fo:root>
    <!-- layout-master (page setup) -->

    <fo:layout-master-set>
      <fo:simple-page-master master-name="body">
        <fo:region-body region-name="portrait" margin-top="20pt"/>
        <fo:region-after region-name="xsl-region-after" extent="20pt"/>
      </fo:simple-page-master>
    </fo:layout-master-set>

    <!-- page sequence (contains page content) -->

    <fo:page-sequence master-reference="body">
      <fo:flow flow-name="portrait">
        <fo:block>
          <xsl:apply-templates/>
        </fo:block>
      </fo:flow>
      <fo:static-content flow-name="xsl-region-after">
        <fo:block>
          Text
        </fo:block>
      </fo:static-content>

    </fo:page-sequence>

  </fo:root>
</xsl:template>

Why doesn’t this work? Apparently, piling the static content into the footer (called “xsl-region-after” in the above code) after having flowed content into the region body (“portrait”) is a no-no. The compiler complains “Static content… is not allowed after <fo:flow>.” Neither of the two books I am currently referring to mentions this.

Instead, the page-sequence section of the code needs to be something like this:

<fo:page-sequence master-reference="body">
  <fo:static-content flow-name="xsl-region-after">
    <fo:block>
      Text
    </fo:block>
  </fo:static-content>
  <fo:flow flow-name="portrait">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </fo:flow>
</fo:page-sequence>

Now why does the static have to come before the flow? Not so sure yet.

I have learned another lesson the hard way: Nothing is going to happen unless the XSL template specifies some flow. I initially thought I could mock up a blank PDF template with a filled footer by removing the <fo:flow> call altogether, but no dice. No flow equates to “This program cannot display the webpage.” (We can remove the content between the <fo:block> calls and that would work fine, but we cannot remove the FO blocks altogether. No work-ee.)
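
A minimal sketch of that blank-page-with-footer idea, assuming the same page master and region names as the code above: the flow stays in place but carries only an empty block.

```xml
<fo:page-sequence master-reference="body">
  <!-- Footer content comes first, before the flow -->
  <fo:static-content flow-name="xsl-region-after">
    <fo:block>Text</fo:block>
  </fo:static-content>
  <!-- The flow must be present and must hold at least one block, but the block can be empty -->
  <fo:flow flow-name="portrait">
    <fo:block/>
  </fo:flow>
</fo:page-sequence>
```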

More XSL page layouts

Filed under: Uncategorized — John Walkup @ 9:20 pm

Let’s go back to the earlier XSLT document from my previous blog. For the purposes of this discussion, we will focus on only one page master for brevity. Note that in the example below, I have set the top margin to 20 points in two different locations: one in the page master declaration, and the other in the region body declaration.

<xsl:template match="sales">
  <fo:root>
    <!-- layout-master (page setup) -->

    <fo:layout-master-set>
      <fo:simple-page-master master-name="body" margin-top="20pt">
        <fo:region-body region-name="portrait" margin-top="20pt"/>
      </fo:simple-page-master>
    </fo:layout-master-set>

    <!-- page sequence (contains page content) -->

    <fo:page-sequence master-reference="body">
      <fo:flow flow-name="portrait">
        <fo:block>
          <xsl:apply-templates select="truck"/>
        </fo:block>
      </fo:flow>
    </fo:page-sequence>

  </fo:root>
</xsl:template>

The effects of the two margin spacings are cumulative: the resulting document has a 40 point top margin spacing. What is going on here? And since only one region body can be declared within a given page master, why would XSL allow a margin spacing command to be placed in both? Which is preferable? Is there ever any difference between one and the other?
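
One place where the two margins do seem to behave differently is when a header region is declared. The sketch below is my own variation on the page master above, with a <fo:region-before> added as an assumption; as far as I can tell, the page master margin pushes every region away from the page edge, while the region body margin only insets the body and leaves room for the header.

```xml
<fo:simple-page-master master-name="body" margin-top="20pt">
  <!-- Body text starts 20pt + 20pt = 40pt below the top edge of the page -->
  <fo:region-body region-name="portrait" margin-top="20pt"/>
  <!-- Header starts 20pt below the page edge and fills the 20pt strip that the region body margin leaves free -->
  <fo:region-before extent="20pt"/>
</fo:simple-page-master>
```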

Evidently, I need to explore this issue further, but later. I have gotten away from my earlier goal of understanding the relationship between template matches and FO layouts.

Page layout in XSLT

Filed under: Uncategorized — John Walkup @ 9:19 pm

In many ways, XSL uses a fairly straightforward method for defining the manner in which text is “flowed” into a page. In other ways, I find it positively goofy (but more on that later).

Consider the following XSLT style sheet. I have dispensed with the usual XSLT boilerplate for brevity, so this snippet begins with the <xsl:template match=”sales”> declaration (although that alone is a sore point with me, but more on that later).

Also, I have not printed the contents of the corresponding XML page either, although I will mention that it contains a node element called <date>. At this point, we are not concerned about what is being printed, but rather how it is being processed. Note that there is one layout master defined (XSL won’t allow multiple layout masters), but two page masters.

<xsl:template match="sales">
  <fo:root>
    <!-- layout-master (page setup) -->

    <fo:layout-master-set>
      <fo:simple-page-master master-name="body">
        <fo:region-body region-name="portrait" margin-top="9pt"/>
      </fo:simple-page-master>

      <fo:simple-page-master master-name="appendix">
        <fo:region-body region-name="landscape" margin-top="5pt"/>
      </fo:simple-page-master>
    </fo:layout-master-set>

    <!-- page sequence (contains page content) -->

    <fo:page-sequence master-reference="body">
      <fo:flow flow-name="portrait">
        <fo:block>
          <xsl:apply-templates select="truck"/>
        </fo:block>
      </fo:flow>
    </fo:page-sequence>

    <fo:page-sequence master-reference="appendix">
      <fo:flow flow-name="landscape">
        <fo:block>
          <xsl:apply-templates select="date"/>
        </fo:block>
      </fo:flow>
    </fo:page-sequence>

  </fo:root>
</xsl:template>

I called one page master “body” (that is, the main body of the paper) and the other “appendix,” which correspond to two sections of a large-scale document that could require their own special treatment. I also called one region body “portrait” and the other “landscape,” although I am not sure why anyone would want an appendix typeset in landscape mode. Note that I have not declared anything in the “landscape” region body that would typeset it in landscape form, at least not yet.
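
For what it is worth, here is a minimal sketch of one way the “appendix” page master might actually be made landscape. It assumes US Letter paper; the page-width and page-height values are my own additions and appear nowhere in the code above. The dimensions go on the page master, not on the region body.

```xml
<!-- Hypothetical landscape version: the page is wider than it is tall -->
<fo:simple-page-master master-name="appendix" page-width="11in" page-height="8.5in">
  <fo:region-body region-name="landscape" margin-top="5pt"/>
</fo:simple-page-master>
```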

From what I can tell so far, declaring a page master is similar to selecting certain attributes in the File/Page Setup command in Microsoft Word, or creating text boxes within Adobe InDesign. (I don’t think LaTeX has a similar functionality in this regard.)

The result of this transformation is to create two pages, with the contents of the <truck> element of the XML file spilling into the “body” page master on the first page and the contents of the <date> element spilling into the “appendix” page master on the second page.

Now, here is what I don’t fully understand: Since page masters cannot contain more than one region body, then what is the purpose of naming the region body? Would the name of the page master not suffice since it has to be unique within the XSL template? In the examples I have seen in books, authors are ascribing the same name to the region body as to the page master, which seems only natural, but also superfluous.

Since layout masters can contain more than one page master, then why do authors avoid introducing this added sophistication in their books at the time the concept of the page master is being developed? To me, showing a layout master with two page masters, with each page master having a unique

Gripe of the day

I realize that computer book authors like to show off bells and whistles, but too much of it can get out of hand. For example, in the above code I only showed one change in the style format for each region body. Sure, I could have set the left margins, right margins, bottom margins, and so on, but what would that teach? I suggest showing one bell or whistle, then referring the reader to the others. This keeps the code simple and saves space.

Luddites and LaTeX

Filed under: Uncategorized — John Walkup @ 9:15 pm

The Luddite movement of the early 1800s offers lessons on company management. The increased use of modern textile machinery compelled some to form militant gangs opposed to its use, resulting in smashed equipment, beatings, and executions. The Luddites were wrong: there was, and still is, room for skilled craftsmen in today’s society. But given the environment they faced at the turn of the nineteenth century, with all of their talent concentrated in a single skill (making textiles), their fears of being shoved aside are easy to understand.

It is never easy parting with what you feel comfortable or skilled doing. For many years (close to 15) I have been a staunch advocate of the typesetting software LaTeX, becoming quite proficient in programming in its environment to produce professional-looking documents. LaTeX is an incredibly powerful typesetting utility that has its uses, especially in scientific and mathematical publication. Our first two state-wide contracts were garnered with proposals created in LaTeX, but it will not serve our needs at The Standards Company LLC much longer, except for special cases.

The software at The Standards Company is primarily based in XML, and that is the direction our company is pointed. The stylesheet language XSL offers many advantages over LaTeX, but none more important to my typesetting goals than its ability to post up raw data in a uniform, dynamic manner. So while LaTeX is what I am comfortable using, it is time to move on.

I am not altogether new to XML (my Director of Research, Ben Jones, lives and breathes it on an hourly basis), but typesetting XML documents using its style sheet companion XSLT is not something I am used to. But if that is what the company needs, then “very well.” To do otherwise is to apply for membership to the Luddites, an option our company does not offer.

Knowing LaTeX has helped me learn to typeset in XSLT. My experience with Adobe InDesign has helped too. Both incorporate the same structure — the use of box and inline structure to flow text into pre-defined regions of the page.

You may be wondering about our company’s position on Microsoft Word. Well, Word is great for letters and small documents. That’s about it. For a grant or RFP response, we would just as soon use a hammer and chisel.

In the future, I plan to post some of my own tutorials on XSLT as I learn it, especially the way this style/markup language relates to LaTeX and InDesign.

Gripe of the day

Example XML documents appear in every tutorial, but authors are often careless in creating them. One problem I see is the use of sample XML code as the text entries in the example. (See Listing 9.1 in Lovell, D. XSL: Formatting Objects, SAMS, 2003.)

Another mistake is to create XML example documents that are too long and complex. Not only does this make discerning elements from the document difficult, it takes a long time to type in the example text in one’s own files. I realize that some of you XSL authors like Shakespeare, but is there a real good reason to use a three-page XML example of a Shakespearean sonnet when a half-page example will do?

To me, the best example XML files are ones that have limited text, but just enough sophistication in the hierarchy to give flexibility to the discussion. For example:

<sales year="1980">
  <car>
    <ford plate="2XMP"> Taurus </ford>
    <chevy plate="3YE"> Cavalier </chevy>
    <dodge plate="8EY"> Charger </dodge>
  </car>
  <truck>
    <ford plate="1YU"> Bronco </ford>
    <chevy plate="1TY"> Tahoe</chevy>
    <chevy plate="2ZX"> Tahoe</chevy>
  </truck>
</sales>

This example XML file has all the sophistication one needs to show off the capabilities of XSL, and none of the elements can be confused with XML or XSL nomenclature. (Note: I had used “Silverado” earlier, but replaced it with “Tahoe” to save space. I also trimmed the plate numbers to three characters. The smaller the sample text, the better.) Note the repeating <chevy> element under the <truck> parent. Note also that the <car> and <truck> elements share the same child elements.
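
To show how little is needed to exercise a file like this, here is a minimal sketch of a style sheet that lists the trucks. It assumes an XSLT 1.0 processor; the text output method and the layout of each line are my own choices for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/sales">
    <!-- Visit every child of <truck>, whatever the make -->
    <xsl:for-each select="truck/*">
      <xsl:value-of select="name()"/>
      <xsl:text> </xsl:text>
      <!-- normalize-space() trims the stray blanks around the model name -->
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="@plate"/>
      <xsl:text>)&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This should print one line per truck, beginning with “ford Bronco (1YU),” with the repeated <chevy> element showing up twice.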

Collaboration Schemes

Filed under: Uncategorized — John Walkup @ 9:13 pm

Ben Jones is the Director of Research for The Standards Company LLC. He mentioned to me in passing that an interesting phenomenon occurring within the company was one he recognized from the Extreme Programming Pocket Guide, one of the pocket guides from O’Reilly.* He pointed out a chapter titled “Developer Practices,” specifically “Developer Practice 2: Pair Programming.” The premise of this subsection of the book is that learning increases when working in pairs on a common goal, and that such learning can spread quickly to the rest of the staff when new partnerships are formed. Although the book is aimed primarily at programmers, learning is learning (which just goes to show that we can all learn a great deal from those working in disciplines outside our own fields).

Program in pairs. When you start a task, ask another developer to work with you. Pairs generally work together for just one task, perhaps an entire afternoon, and then form other pairs with new partners. This spreads the knowledge of the system throughout the whole team.

But there is more to it than that.

The person with the keyboard –- the driver –- focuses on the details of the task. He thinks tactically. The other person –- the navigator –- keeps the entire project in mind, ensuring that the task fits into the project as a whole and keeping track of the team’s guidelines. She thinks strategically. Both roles are important, and both roles are fluid. When inspiration strikes you, drive. Your partner will navigate. Change roles as necessary.

Our collaborative teams are not quite as systematic. But maybe they should be. Regardless of whether you are computer programming or analyzing student assignments (and The Standards Company does plenty of both), the pair-programming structure appears to be a highly effective way to bring new staff members up to speed and to invigorate all staff members towards innovation. We are going to give pair programming a hard look over the next few weeks and judge its efficacy. Stay tuned.

All excerpts are from Extreme Programming Pocket Guide, O’Reilly & Associates: 2003.

* Not everyone is completely enamored with the concept of extreme programming. Here is computer guru Don Knuth: “With the caveat that there’s no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I’ve ever heard associated with the term ‘extreme programming’ sounds like exactly the wrong way to go…with one exception. The exception is the idea of working in teams and reading each other’s code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.”

Bloom’s Taxonomy and Student Engagement

Filed under: Uncategorized — John Walkup @ 9:12 pm

A trip into a classroom last year reminded me of an important ingredient of quality teaching. My memory is fuzzy, but I recall the grade level of the children as fifth grade. I remember the lesson quite well, however. The teacher was developing the concepts of median, mean, and mode and teaching the students how to perform the computations. The teacher was, for the most part, using a lot of research-based strategies during her lesson, which is what I was primarily seeking and measuring. However, many of the students were barely paying attention, and one of the students (I will call him “Thor”) was not only off-task but was pestering nearby students.

I sidled up to the teacher and told her that I was curious if students had a preference for one of the three statistical measures over the others. Which of the three – mean, median, or mode – do they consider the easiest to calculate, and why? (Naturally, the next question would be, “Which is the hardest, and why?”) I asked her if she would mind asking the question and putting the students into small peer groups to discuss their answers.

The results were magical. Even Thor dived into the discussion. “Pick the mode, Stupid! All you have to do is look for the number that shows up the most!” Although noisy, the class was filled with discussions on academic content. Furthermore, many students were re-reading their notes.

The key to classroom engagement and differentiating instruction is Bloom’s Taxonomy. By asking a higher-order (evaluation-level) question and prompting students to work in small groups, the teacher was able to engage every student in the class on lesson content. Gifted students were given a chance to lead the small-group discussions; weaker students learned the material by discussing it with their peers.

In my opinion, an insufficient number of questions and activities centered on higher-order thinking skills is the most significant cause of low student achievement and classroom management problems.

Cognitive Subversion

Filed under: Uncategorized — John Walkup @ 9:07 pm

I love listening to Jennifer Elkins and Gerlinde Olvera banter. Jennifer and Gerlinde are two of the Team Leaders in science and mathematics at the company, which has decided that giving researchers plenty of free time to discuss complex educational issues in detail is well worth it. For the past two weeks, they have been discussing (and arguing over) some of the finer points of cognitive rigor, a framework for analyzing student work. Sometimes I jump in. One of our recent discussions centered on a subtle element of cognition that greatly affects teaching, a (sometimes) undesirable behavior we call cognitive subversion. Jennifer and Gerlinde have a detailed account of it in their own blog, but I saw a real-life example of it in a Southern California classroom.

A teacher was asking her students to find the area of a right triangle. All that was known was the length of the hypotenuse and that one of the angles measured 30 degrees. This is a complex activity, requiring that students first recognize that placing two of these triangles next to each other forms an equilateral triangle. There are quite a few steps that follow, and I won’t go into them in detail. But I will mention that this activity aligns to a depth-of-knowledge (DOK) level of 3.

However, the teacher began providing the students hints. The first hint she mentioned was, “Think about placing two of the triangles next to each other.” By providing hints to each step, the teacher was replacing a DOK-3 level activity with a succession of DOK-1 (recall) level steps.

Hints are often helpful, especially to weaker students. But hints can also subvert the cognitive rigor of an assignment, transforming a higher-order thinking activity into a series of low-level steps. We call this “cognitive subversion.”

There is a time and place for providing hints. And as objective observers, we are mindful that the strategies used by teachers can often be difficult to judge as beneficial without understanding the classroom environment. But teachers need to be aware of why they do the things they do (a process called metacognition). By being well-informed of what constitutes Bloom’s taxonomy and depth-of-knowledge, teachers can provide hints as part of a well-informed strategy, rather than based only on gut feelings.

More on vocabulary development

Filed under: Uncategorized — John Walkup @ 9:04 pm

In my last blog, I discussed how time that is often lost during the school day could be used to develop students’ vocabulary. I want to talk about vocabulary development a little more.

To many people, the term “vocabulary development” means “teaching students the meaning of new words.” That is only partially correct. For students to be taught vocabulary effectively, they need to not only know the meanings of the new words, but also be comfortable in their use. A true vocabulary is a working vocabulary; otherwise we run the risk of preparing students for Jeopardy, not real-world life skills.

Therefore, for students to have really learned a new word, they must have learned how to use the word in their everyday experiences. They not only need to recognize the word when they see it written and hear it spoken, they also need to be comfortable saying it and writing it. For a full vocabulary development approach, we need to employ at least four strategies:

1. Compelling students to say the word until they are comfortable pronouncing it.
2. Stating the word out loud and asking students to listen carefully to its correct pronunciation.
3. Making students write the word until they consistently spell it correctly.
4. Writing the word on the board and pointing out its spelling intricacies.

Consider the word “segue.” If a student is not sure he knows how to spell the word correctly, then he will simply not write it and substitute a more comfortable word or phrase. In my opinion, “segue” is not truly in the student’s vocabulary. And if the student is not comfortable pronouncing the word, then the student is less likely to read more challenging material.

In summary, students need to be taught to recognize words in print, to say the words out loud, to write the words on their own, and to recognize words when they hear them.

This takes time, of course. Therefore it is important that teachers be careful in deciding which vocabulary words to choose to develop (and this is where Marzano’s word lists come in handy) and ensure that they are using as much of their allotted time as possible for true academic instruction.
