LinuxDevCenter.com

The Writer's Workbench

Checking text for readability

The style command analyzes the writing style of a given text. It performs a number of readability tests on the text and outputs their results, and it gives some statistical information about the sentences of the text.



Give as an argument the name of the text file to check. For example, to check the readability of the file banquet-speech.txt, you'd type:

$ style banquet-speech.txt RET

Like diction, style reads text from the standard input if no file name is given.

The various readability formulas that style uses and outputs are as follows:

  • the Kincaid formula, originally developed for Navy training manuals and a good readability measure for technical documentation;
  • the Automated Readability Index (ARI);
  • the Coleman-Liau Formula;
  • the Flesch Reading Ease formula, which gives an approximation of readability from 0 (difficult) to 100 (easy);
  • the Fog Index, which gives a school grade reading level;
  • the WSTF Index, a readability indicator for German documents; and
  • the Wheeler-Smith Index, Lix formula and SMOG-Grading tests, all readability indicators which give a school grade reading level.
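To make the scores less abstract, here is a minimal sketch of three of these formulas in their commonly published textbook forms, written as shell functions around awk. The function names (ari, kincaid, flesch) are made up for this illustration, and the counts are passed in by hand; the way style itself counts characters, syllables, and sentences may differ, so its figures need not match these exactly.

```shell
# Textbook forms of three readability formulas.
# The function names are hypothetical; style may count things differently.

# ari CHARS WORDS SENTENCES  (Automated Readability Index)
ari() {
    awk -v c="$1" -v w="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", 4.71 * c / w + 0.5 * w / s - 21.43 }'
}

# kincaid WORDS SENTENCES SYLLABLES  (Kincaid grade level)
kincaid() {
    awk -v w="$1" -v s="$2" -v y="$3" \
        'BEGIN { printf "%.1f\n", 0.39 * w / s + 11.8 * y / w - 15.59 }'
}

# flesch WORDS SENTENCES SYLLABLES  (Flesch Reading Ease, 0 to 100)
flesch() {
    awk -v w="$1" -v s="$2" -v y="$3" \
        'BEGIN { printf "%.1f\n", 206.835 - 1.015 * w / s - 84.6 * y / w }'
}
```

For a text of 100 words in 5 sentences with 470 characters and 150 syllables, ari 470 100 5 prints 10.7, kincaid 100 5 150 prints 9.9, and flesch 100 5 150 prints 59.6.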

The sentence characteristics of the text which style outputs are as follows:

  • Number of characters
  • Number of words, their average length, and average number of syllables
  • Number of sentences and average length in words
  • Number of short and long sentences
  • Number of paragraphs and average length in sentences
  • Number of questions and imperatives
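A rough idea of where two of these numbers come from can be had with standard tools. The sketch below uses hypothetical helper names and a deliberately crude rule, treating every period, exclamation point, and question mark as the end of a sentence, so abbreviations such as "e.g." are overcounted; style applies its own, more careful rules.

```shell
# Crude counts with standard tools; the helper names are hypothetical.

# count_words FILE: number of whitespace-separated words
count_words() { wc -w < "$1" | tr -d ' '; }

# count_sentences FILE: number of ., ! and ? characters in the file
count_sentences() { tr -cd '.!?' < "$1" | wc -c | tr -d ' '; }

# avg_sentence_length FILE: words per sentence, to one decimal place
avg_sentence_length() {
    awk -v w="$(count_words "$1")" -v s="$(count_sentences "$1")" \
        'BEGIN { if (s > 0) printf "%.1f\n", w / s }'
}
```

Running avg_sentence_length banquet-speech.txt would then print the average sentence length in words under these simplified rules.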

Finding difficult sentences

To output just "difficult" sentences of a text, use the -r option followed by a number; style will output only those sentences whose ARI readability index is greater than the number you give.

For example, to output all sentences in the file banquet-speech.txt whose readability is greater than a value of 20, type:

$ style -r 20 banquet-speech.txt RET
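If you are curious what such a filter involves, the following is a minimal sketch that scores each sentence with the textbook ARI formula and prints the ones above a threshold. The function name hard_sentences is made up for this example, sentences are split naively at periods, exclamation points, and question marks, and only letters and digits are counted as characters; these are assumptions, so the selection can differ from what style -r prints.

```shell
# hard_sentences THRESHOLD FILE
# Print each sentence whose textbook ARI score is above THRESHOLD.
# Hypothetical helper; sentence splitting is deliberately naive.
hard_sentences() {
    tr '\n' ' ' < "$2" | awk -v min="$1" '
    {
        text = $0
        # Peel off one sentence at a time, ending at ., ! or ?
        while (match(text, /[^.!?]*[.!?]+/)) {
            s = substr(text, RSTART, RLENGTH)
            text = substr(text, RSTART + RLENGTH)
            sub(/^ +/, "", s)
            words = split(s, w, " ")
            # Count only letters and digits as characters
            letters = s
            gsub(/[^A-Za-z0-9]/, "", letters)
            # ARI for a single sentence (sentence count is 1)
            score = 4.71 * length(letters) / words + 0.5 * words - 21.43
            if (score > min) print s
        }
    }'
}
```

It is called in the same spirit as the command above: hard_sentences 20 banquet-speech.txt.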

Displaying long sentences

You can use style to output long sentences by giving a number of words as an argument to the -l option; style will output only those sentences that contain more words than the number you give.

For example, to output all sentences longer than 14 words in the file banquet-speech.txt, type:

$ style -l 14 banquet-speech.txt RET
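The same kind of selection can be approximated with tr and awk, which may be handy on a system where style is not installed. This is only a sketch under simple assumptions: the function name long_sentences is hypothetical, a sentence is taken to end at any period, exclamation point, or question mark, and a word is any run of non-blank characters.

```shell
# long_sentences COUNT FILE
# Print each sentence that has more than COUNT words.
# Hypothetical helper; sentence splitting is deliberately naive.
long_sentences() {
    tr '\n' ' ' < "$2" | awk -v min="$1" '
    {
        text = $0
        # Peel off one sentence at a time, ending at ., ! or ?
        while (match(text, /[^.!?]*[.!?]+/)) {
            s = substr(text, RSTART, RLENGTH)
            text = substr(text, RSTART + RLENGTH)
            sub(/^ +/, "", s)
            # split on blanks returns the number of words
            if (split(s, w, " ") > min) print s
        }
    }'
}
```

For example, long_sentences 14 banquet-speech.txt prints the sentences of more than 14 words, one per line.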

Spelling

Two additional commands that Walker says were part of the Writer's Workbench have long been standard on Linux: look and spell. Both tools work on the system dictionary file, /usr/dict/words (on systems that follow the Filesystem Hierarchy Standard, the file is /usr/share/dict/words). This file is nothing more than a word list (albeit a very large one), sorted in alphabetical order and containing one word per line. Words that are correct regardless of case are listed in lowercase letters, and words that rely on some form of capitalization in order to be correct (such as proper nouns) appear in that form.

The look tool outputs words in the system dictionary that begin with the text you give as an argument. It's useful for checking to see which words begin with a particular phrase or prefix.

For example, to list all the words in the dictionary that begin with the text "homew", you'd type:

$ look homew RET

This command will output words such as "homeward" and "homework."
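Because the dictionary is a plain list with one word per line, a prefix lookup of this kind is easy to imitate. The sketch below is an illustration only: look_prefix is a hypothetical name, the word list is passed explicitly as the second argument, and the file is scanned from top to bottom, whereas look itself takes advantage of the sorted order.

```shell
# look_prefix PREFIX WORDLIST
# Print the lines of WORDLIST that begin with PREFIX, ignoring case.
# Hypothetical helper; a plain linear scan, not how look itself searches.
look_prefix() {
    awk -v p="$1" 'index(tolower($0), tolower(p)) == 1' "$2"
}
```

With the system dictionary as the word list, look_prefix homew /usr/dict/words lists the same kind of matches as the look command above.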

When you're unsure whether or not a particular word is spelled correctly, use spell to find out. It reads from the standard input and outputs any words that don't appear in the system dictionary file -- so if a word is potentially misspelled, it will be echoed back on the screen after you type it.

For example, to check if the word "occurance" is spelled correctly, you'd type:

$ spell RET
occurance RET
occurance
^D
$

In this example, spell echoed the word "occurance" after it was typed, meaning that this word was not in the system dictionary and therefore was likely a misspelling. A Control-D was typed to exit spell and return to the shell prompt.
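The basic idea, printing every input word that has no exact match in the word list, can be sketched with tr, sed, and grep. The function name not_in_dict is invented for this example and the word list is given as an argument. The sketch compares whole words only and ignores case, so it knows nothing about plurals or other derived forms that a real spelling checker may accept.

```shell
# not_in_dict WORDLIST
# Read text on standard input and print each word that does not appear
# as a whole line in WORDLIST, ignoring case.
# Hypothetical helper; no handling of plurals or other derived forms.
not_in_dict() {
    tr -cs 'A-Za-z' '\n' |         # put each word on its own line
        sed '/^$/d' |              # drop empty lines
        grep -v -x -i -F -f "$1"   # keep words absent from the list
}
```

For instance, echo occurance | not_in_dict /usr/dict/words prints "occurance" again if that spelling is not in the word list.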

Next week: How to make and manage documents with SGML-tools.

Michael Stutz was one of the first reporters to cover Linux and the free software movement in the mainstream press.


©2009, O'Reilly Media, Inc.