Welcome, Readers of the Future

I'm working on the next edition of Sams Teach Yourself Java in 24 Hours. Java 9 has a new HTTP client package, jdk.incubator.http, that makes it a lot easier to GET and POST to web servers and other software that communicates over HTTP.

For a demo, I needed a simple server that could take POST requests and do something with them without requiring a user login. I was about to write one when I realized I already had. This blog takes comments submitted over POST.
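Here's a rough sketch of what a comment-posting client might look like. It uses java.net.http, the package that jdk.incubator.http became when the client was standardized in Java 11, so it runs on current JDKs without incubator flags. The endpoint URL and the form field names are made up for illustration; they are not this blog's real comment form.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class CommentPoster {
    /** Encodes form fields as application/x-www-form-urlencoded text. */
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    /** Builds a POST request that carries the fields as a form body. */
    static HttpRequest buildPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint and field names; substitute real ones.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Reader");
        fields.put("comment", "I made it to Hour 22!");
        HttpRequest request = buildPost("https://example.com/post-comment", fields);

        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Status: " + response.statusCode());
    }
}
```

Building the request separately from sending it makes it easy to inspect what will go over the wire before touching a real server.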

When the book comes out, I'll be able to see from these comments that readers have reached Hour 22.

Teach Yourself Java 8 in 24 Hours

Cover to Sams Teach Yourself Java in 24 Hours, 7th Edition

I'm working on author review today for the Java 8 edition of my book Sams Teach Yourself Java in 24 Hours. This is the phase of the project near the finish line where I get all the chapters back as edited Word documents, review the changes recommended by editors and answer any questions they have. I also give each chapter a quick read and make sure the code compiles. (I hate it when a computer book has code that doesn't compile.)

One of the things I like about writing a 24 Hours book is that my publisher, Pearson Education, gives me license to have some fun with the material. Computer books can be as dry as Ben Stein teaching high school economics if you don't liven them up.

I ended chapter 2 with this passage:

During this hour, you got your first chance to create a Java program. You learned that to develop a Java program you need to complete these four basic steps:

1. Write the program with a text editor or a tool such as NetBeans.

2. Compile the program into a class file.

3. Tell the Java Virtual Machine to run the class.

4. Call your mother.

Along the way, you were introduced to some basic computer programming concepts such as compilers, interpreters, blocks, statements, and variables. These will become clearer to you in successive hours. As long as you got the Saluton program to work during this hour, you're ready to proceed.

(The fourth step has nothing to do with Java programming. It's just something my mother suggested I put in the book.)

The book comes out May 23. Supplies are limited to however many we print.

Converting a WordPress Blog to HTML Files

WordPress logo tilted to the left

I've been doing more programming lately, primarily in Java because I am writing several books that teach the language. I have a few big announcements coming soon about those projects.

My current coding effort is an application that turns a no-longer-updated WordPress blog into a set of static HTML pages. The goal is to make it easier to retire a blog while keeping the content available in the form that's most likely to be future proof and extremely simple to move around.

WordPress can export a blog's pages, entries and comments to a single XML file. The export file is an RSS feed extended with several namespaces, which the company has dubbed WordPress eXtended RSS (WXR). To create a WXR file of your blog, go to your WordPress dashboard and choose Tools, Export. A page opens with an Export command that creates the file and initiates the download to your computer.

Although the WXR format isn't documented, any programmer who has worked with RSS feeds can figure out the purpose of most elements just by looking at an export file in a text editor.
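As a starting point, here's a minimal sketch of pulling entries out of an export file with the JDK's built-in DOM parser. It only relies on the plain RSS item and title elements plus content:encoded from the RSS content module; the WordPress-specific wp: elements are left out because their namespace URI changes between export versions. The class and method names are my own invention for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WxrReader {
    // Namespace of the RSS content module, which supplies content:encoded.
    static final String CONTENT_NS = "http://purl.org/rss/1.0/modules/content/";

    /** One exported entry: its title and its HTML body. */
    static final class Entry {
        final String title;
        final String html;

        Entry(String title, String html) {
            this.title = title;
            this.html = html;
        }
    }

    /** Reads every item in the export and returns its title and body. */
    static List<Entry> readEntries(String wxr) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(wxr)));

            List<Entry> entries = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                entries.add(new Entry(
                    firstText(item.getElementsByTagName("title")),
                    firstText(item.getElementsByTagNameNS(CONTENT_NS, "encoded"))));
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse export file", e);
        }
    }

    /** Returns the text of the first node in the list, or "" if there is none. */
    private static String firstText(NodeList nodes) {
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }
}
```

From there, each Entry could be written out as its own HTML file. A real export also mixes pages, attachments and comments into the same item list, so a converter would need to tell those apart.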

I could use some guinea pigs, so if you have a WordPress blog and are willing to share its WXR file, I can send a copy back to you as a static web site. Send me an email or comment and we'll arrange how to get the file to me.

Kim Polese is a Cautionary Tale?

Photo of Kim Polese by Dan Farber

There's a dreadful sexist commentary on Forbes magazine today by Eric Jackson that suggests early Java executive Kim Polese caused herself to be wildly overhyped and the same mistake could be happening today to Facebook chief operating officer Sheryl Sandberg.

Under a headline that dubs each woman a Silicon Valley "It Girl," Jackson makes comparisons between the two women that all relate to gender, aside from flimsy observations that "they both like(d) magazine covers and editorial spreads" and "they both get (got) ranked on different arbitrarily created rankings of important people/power lists by business publications."

Making matters worse, he doesn't know his '90s dot-com history.

I was following Java closely in 1997 as I cowrote my first edition of Teach Yourself Java in 21 Days.

Polese wasn't a poor choice for Time magazine's 25 most influential people that year. The Java language had exploded in popularity since being included in the Netscape Navigator browser two years earlier. Polese and three core members of the team that created Java at Sun Microsystems, Arthur van Hoff, Jonathan Payne and Sami Shaio, all left together to start Marimba. It was widely viewed as the first hot startup to build on the technology, which was as big in Silicon Valley then as social networking is today. Polese had named Java, served as its product manager and became one of its best-known evangelists.

If you're a list-making journalist looking for somebody associated with Java to be the face of the trend, Polese was one of the top choices.

"History doesn't remember Kim Polese so well," Jackson claims, but Polese led Marimba to its public offering in 1999 and profitability. In 2004, the company was sold to BMC Software for $239 million. Though that's not as successful as people expected it to be in the heady days of the launch, it wasn't a flop. Any software startup that rode out the dot-com bust and sold for nine figures afterwards was doing something right.

Jackson writes that her next venture, SpikeSource, "seems to have quickly gone out of business." The company launched in 2003, Polese became the CEO a year later and it was in business for six years as an open-source infrastructure developer before folding.

So that she doesn't end up a sad "cautionary tale" like Polese, Sandberg should get back to her job and stop accepting so many awards and magazine covers, Jackson advises:

Maybe you should tone down the public appearances for a while and just keep your head down at Facebook. That doesn't mean do no public appearances or keep your light beneath a bushel. It just means to keep a balance more between the internal job and external stuff. There will always be some new Fortune Magazine cover to do, or award for being the most powerful woman executive in the world to accept.

(Any Facebook exec who thinks magazine covers will "always" be there should call the people running MySpace -- if their phones still work.)

Eric Jackson is a cautionary tale on how not to write about women in tech. Since publishing the commentary, he has deleted a line about how Sandberg's husband is "super-smart to boot" and another telling her to "just keep your head down at Facebook" -- without noting the edits were made. People are lining up on Facebook and Twitter to work him over.

The same year Polese made the Time 25, I taught myself Java and applied for a job at Marimba.

I'm still waiting to hear back.

Credit: The photo of Kim Polese was taken by Dan Farber and is available under a Creative Commons license.

Teach Yourself Java While Still at the Bookstore

Out of thousands of comments made about the PAC expenditure story, this one on Balloon Juice is my favorite:

Roger Cadenhead, who posted this, is someone who has churned out a large number of computing books, many with titles like Sams Teach Yourself Java 2 in 24 Hours or Sams Teach Yourself Java 2 in 21 Days. As a software engineer, these titles make me doubt Cadenhead’s credibility. It might-just-be possible to learn a substantial amount of Java in 21 days (it is a very large language once one counts the libraries), but I don't know any non-trivial computer language in which most people can be fluent in less than six months.

As the author of more than a dozen Teach Yourself Subject in Refreshingly Short Time Period books, I occasionally get sent the link to Google director of research Peter Norvig's essay Teach Yourself Programming in 10 Years and Abstruse Goose's comic strip on the easiest way to teach yourself C++ in 21 Days.

I'm currently working on Sams Teach Yourself C++ in 24 Hours, so these guys are hitting me right in the meal ticket.

The official reason for the titles of these books is that each chapter is designed for readers to accomplish in that time period. So if you read Sams Teach Yourself Java in 24 Hours, and I strongly believe that you should, you can read each chapter and complete its projects in an hour. The same goes for Sams Teach Yourself Java 6 in 21 Days, but you get one day for each chapter because the material is harder. Whether you complete these books in 24 consecutive hours or 21 consecutive days -- or space it out and take breaks -- is up to you.

The unofficial reason for the titles? If I called my next book Teach Yourself C++ in 10 Years, it would sell as well as Lose Weight by Watching Your Diet and Exercising Regularly and Become a Millionaire by Working Hard for 40 Years and Saving Your Money.

The Norvig essay ends with a line that my publisher should use on the next edition: "[G]o ahead and buy that Java book." -- Peter Norvig, Google

W3C Serves 130 Million XML DTDs Per Day

A Java application I wrote that reads several dozen RSS feeds started running into trouble with the W3C. Feeds failed with HTTP 503 "Service Unavailable" errors like this one:

Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

At first I thought this was a temporary error. HTTP 503 errors are defined to indicate that a server is temporarily overloaded or undergoing maintenance.

However, the W3C Systems Team announced in February 2008 that they were dealing with so much traffic for their XML DTD files that they were using 503 errors to deal with bandwidth-hogging XML clients that request the files too often:

... we receive a surprisingly large number of requests for such resources: up to 130 million requests per day, with periods of sustained bandwidth usage of 350Mbps, for resources that haven't changed in years. ...

A while ago we put a system in place to monitor our servers for abusive request patterns and send 503 Service Unavailable responses with custom text depending on the nature of the abuse. Our hope was that the authors of misbehaving software and the administrators of sites who deployed it would notice these errors and make the necessary fixes to the software responsible.

But many of these systems continue to re-request the same DTDs from our site thousands of times over, even after we have been serving them nothing but 503 errors for hours or days.

Although the problem went away for reasons I don't yet understand, I'm looking for a way to read local copies of the XML DTDs with the XOM Java XML library. XOM doesn't yet support XML Catalogs, an XML standard for handling this kind of issue.
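One approach that doesn't need XML Catalogs is a SAX EntityResolver that hands the parser a local file whenever it asks for a known DTD URL. The sketch below uses the JDK's own DOM parser rather than XOM, and the mapping from URLs to local paths is something you'd have to fill in yourself. I believe XOM's Builder can be constructed around an XMLReader that has a resolver like this installed, but I haven't shown that here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class LocalDtdResolver implements EntityResolver {
    // Maps a DTD's system ID (its URL) to a copy saved on disk.
    private final Map<String, Path> localCopies;

    LocalDtdResolver(Map<String, Path> localCopies) {
        this.localCopies = localCopies;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws IOException {
        Path local = localCopies.get(systemId);
        if (local == null) {
            // Not mapped: returning null lets the parser fetch the URL itself.
            return null;
        }
        InputSource source = new InputSource(Files.newInputStream(local));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    /** Parses XML text, serving any mapped DTDs from disk instead of the network. */
    static Document parse(String xml, Map<String, Path> localCopies) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(new LocalDtdResolver(localCopies));
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

The XHTML DTDs pull in separate entity files of their own, so those URLs would need entries in the map too, or the parser will still go to the W3C for them.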

Real-Time Twitpic Images Coming from Chile

When news breaks such as today's massive earthquake in Chile, one of the first places where images show up from the scene is on Twitpic, a popular image-posting service for Twitter users. You can find links to these images on Twitter search by including "twitpic" as one of your search terms, but that's not as useful as seeing thumbnails of the actual images. You have to click each link to see what it contains.

To make it easier to see the images being posted about Chile, I wrote a Java application this morning that uses the Twitter and Twitpic APIs to download thumbnails and display them in reverse chronological order. Each thumbnail can be clicked to open the photo on Twitpic's site.

The application produces a web page and RSS feed, updated every two minutes.

The application is a mashup that does the following:

  1. Downloads a search feed of "twitpic chile" from Twitter in Atom format.
  2. Extracts Twitpic URLs from tweets in that feed.
  3. Calls the Twitpic API to get the thumbnail of each photo as a JPEG image.
  4. Saves the thumbnail to a directory on my server.
  5. Produces a web page and feed of the saved thumbnails, sorted by each file's creation time.
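Step 2 is the easiest piece to show in isolation. This is a small sketch, not the application's actual code, and it assumes Twitpic links look like twitpic.com/ followed by an alphanumeric photo ID.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwitpicLinks {
    // Assumed link shape: twitpic.com/ followed by an alphanumeric photo ID.
    private static final Pattern LINK =
        Pattern.compile("https?://(?:www\\.)?twitpic\\.com/([A-Za-z0-9]+)");

    /** Returns the distinct photo IDs found in a tweet, in order of appearance. */
    static List<String> photoIds(String tweet) {
        Set<String> ids = new LinkedHashSet<>();
        Matcher matcher = LINK.matcher(tweet);
        while (matcher.find()) {
            ids.add(matcher.group(1));
        }
        return new ArrayList<>(ids);
    }
}
```

Collecting the IDs into a LinkedHashSet drops repeated links while keeping the order in which they appeared, which matters when retweets repeat the same photo.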

I'll be releasing it under the GPL when it's done. This application includes support for PubSubHubbub, so subscribers can see new photos show up in real time with clients such as Google Reader.

Book Giveaway: Teach Yourself Java in 24 Hours

My newest book, Sams Teach Yourself Java in 24 Hours, Fifth Edition, recently hit bookstores. The book is a for-absolute-beginners guide to programming Java, and this excerpt from chapter one's Q&A section shows how much license I get from the publisher to have fun with the series:

Q. Do you only answer questions about Java?

A. Not at all. Ask me anything.

Q. Okay, why is Prince mad at the Foo Fighters?

A. Prince is unhappy that the Foo Fighters performed a cover of his song "Darling Nikki" and released it as a B-side single in Australia. He told Entertainment Weekly they should write their own tunes and wouldn't let the band release it in the United States. This became a pretty meaningless distinction as the song became a radio hit around the globe and was played regularly during their concerts.

When Prince performed at Super Bowl XLI a few years later, he covered the Foo Fighters' "Best of You," an artistic decision that surprised the Foo Fighters as much as everybody else.

"It was pretty amazing to have a guy like Prince covering one of our songs," Foo Fighters drummer Taylor Hawkins told MTV, "and actually doing it better than we did."

Although playing someone else's music is an odd way to exercise a grudge, this was a better option for the 5-foot-2 Prince than challenging the band to a fight.

Every chapter ends with one reader question that has bupkis to do with Java. I used to be the Fort Worth Star-Telegram's Ed Brice, an answer man who fielded random questions, so old habits die hard.

Sams Teach Yourself Java in 24 Hours, Fifth Edition

My book has been fully updated for Java 6 and has new chapters on JAX-WS and game programming. I have 20 copies I'd like to give to people who want to learn Java, and there's still time for me to mail them before Christmas.

If you know someone who wants to learn Java, or you can make a convincing case for why Santa owes you this book after the year 2009 you just endured, please leave a comment here on Workbench or in a Twitter post to rcade. Make sure I have some means of contacting you, so I can get the address of the person getting the book.

I'm planning on mailing these out on Wednesday morning in the pre-Christmas scrum at the post office. I will mail the books directly to the people receiving them and can put your name and address as the sender and wrap them if necessary. No one needs to know I was involved.

Please note that I'm expecting the people who get this free book to teach themselves Java in a single contiguous 24-hour period. For too long, Sams has coddled readers who devote one hour a day to a subject and learn it at their leisure.

Kickin' It Old School with Microsoft Word 97

Windows 98 Microsoft Channel Bar

I began a new book this week on Java programming for beginners. I haven't been doing much computer book writing for a couple years, so I no longer had an installed copy of Microsoft Word 97, the version of the software my publisher uses to draft manuscripts. Word 2007 can save files in 97 format, but it doesn't support the publisher's custom styles, so I decided to install Word 97 on Vista.

Huge mistake.

Word 97 appeared to install properly, but when I installed some other Microsoft software afterward, it removed files that Word 97 requires to run. Now the program reports a registry error every time it runs and Vista won't uninstall it or install a new copy.

After considering other options, I installed a trial version of VMware Workstation, $188 software that creates virtual computers in which you can run other operating systems. You run the simulated computer in its own window after deciding how much disk space and memory to allocate to it, and it acts like it's an entire computer. After setting up one of these virtual systems, you can clone it, suspend it and run it remotely over the Internet.

Using VMware, I created a new virtual Windows XP system where I can run Word 97 and the other software required to write my book. As far as I know, this Pinocchio virtual computer thinks it's a real PC.

Because Microsoft is run by sadists, I had to install Windows 98 before I could install a Windows XP upgrade. It was weird to step back in time and see the Microsoft channel bar, an early stab at web syndication that predated RSS. During installation, Windows 98 also touts its support for USENET newsgroups. Kids today don't know how good they got it. In my day, if we wanted to see celebrities naked, we had to know how to UUdecode.

If anyone has any experience with VMware, I'd like to hear how well it works. My biggest concern is whether anything I do inside the virtual computer can adversely impact the real Vista system it runs on. I want virtual computers that I can destroy with impunity by running buggy beta software and other dodgy programs that don't get along with each other. I end up doing that a lot in the course of writing a book.

Following Web Page Redirects with Java

CNET moved a bunch of its blogs to a different domain this weekend, including Beyond Binary, Coop's Corner, Geek Gestalt, One More Thing, Outside the Lines and The Social. I mention this because the change hosed Meme13, which treated all six as if they were newly discovered sites.

One of my ground rules for developing Meme13 is that I won't hand-edit the site to make it smarter. I need the application to recognize when existing sites in its database have moved.

Meme13 monitors sites using a Java application I wrote that downloads web pages with the Apache HttpClient 3.0 class library. Web servers indicate that a page has moved by sending an HTTP redirect response of either "301 Moved Permanently," which indicates a permanent move, or "302 Found," which is intended for temporary changes. I wrote a Java method that can find the current location of a web page, even if it has been redirected one or more times:

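import java.io.IOException;
import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.HeadMethod;

public String checkFeedUrl(String feedUrl) {
    String response = feedUrl;
    HttpClient client = new HttpClient();
    HttpMethod method = new HeadMethod(feedUrl);
    method.setFollowRedirects(false);
    try {
        // request the feed's headers only
        int statusCode = client.executeMethod(method);
        if ((statusCode == 301) || (statusCode == 302)) {
            // feed has moved
            Header location = method.getResponseHeader("Location");
            if ((location != null) && !location.getValue().equals("")) {
                // recursively check URL until it's not redirected any more
                response = checkFeedUrl(location.getValue());
            }
        }
    } catch (IOException ioe) {
        // on a network error, fall back to the URL we were given
        response = feedUrl;
    } finally {
        // return the connection to the client whether or not the request worked
        method.releaseConnection();
    }
    return response;
}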

The HeadMethod class requests a web page's headers instead of requesting the entire page, consuming far less bandwidth as it checks for redirects. My Java method looks for both kinds of redirects, because web publishers have a bad habit of using "302 Found" when they've moved a page permanently.
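A recursive check like this can spin forever if two addresses redirect to each other, and it trips over servers that send a relative Location header. Here's a sketch of an iterative variant with a hop limit, written against the JDK's HttpURLConnection instead of HttpClient 3.0. The Probe interface and the method names are hypothetical, introduced only so the loop can be exercised without a network.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Optional;

class RedirectResolver {
    /** Looks up where a URL redirects to, if anywhere. */
    interface Probe {
        Optional<String> locationOf(String url) throws IOException;
    }

    /** Follows redirects one hop at a time, giving up after maxHops. */
    static String resolve(String url, int maxHops, Probe probe) {
        String current = url;
        for (int hop = 0; hop < maxHops; hop++) {
            Optional<String> location;
            try {
                location = probe.locationOf(current);
            } catch (IOException e) {
                return current; // keep the last known address on network errors
            }
            if (location.isEmpty() || location.get().isEmpty()) {
                return current; // no redirect, so this is the final address
            }
            // Location may be relative, so resolve it against the current URL.
            current = URI.create(current).resolve(location.get()).toString();
        }
        return current;
    }

    /** A Probe that sends a HEAD request and reports a 301 or 302 target. */
    static Optional<String> headProbe(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        conn.setInstanceFollowRedirects(false);
        try {
            int status = conn.getResponseCode();
            boolean moved = (status == 301) || (status == 302);
            return moved
                ? Optional.ofNullable(conn.getHeaderField("Location"))
                : Optional.empty();
        } finally {
            conn.disconnect();
        }
    }
}
```

A call such as RedirectResolver.resolve(feedUrl, 5, RedirectResolver::headProbe) would follow at most five redirects before settling on whatever address it has reached.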

Exporting a Manila Site Using OPML

The RSS Advisory Board site now includes all of the articles, weblog entries, and comments from the group's old Manila site, dating back to the group's founding in 2004.

I never got a copy of the old site's root file from Harvard, so I collected the content using an obscure but cool feature of Manila: All site content is saved in the discussion board as individual messages, each of which can be downloaded as an OPML file. For example, open this weblog entry from Craig Burton's Manila blog in OPML format.

I wrote a Java application that used Apache HttpClient to download the files and XOM to process the OPML.

OPML sucks, but I got thousands of weblog files into a MySQL database so I can't complain. Manila stores message text in the text attribute of outline elements, some of which may be nested. Weblog entries are formatted using the most insane thing I've ever seen in an XML dialect:

<outline text="&lt;newsItem&gt;"/>
<outline text="&lt;title&gt;Hackers selling IDs for $14, Symantec says&lt;/title&gt;"/>
<outline text="&lt;url&gt;&lt;/url&gt;"/>
<outline text="&lt;/newsItem&gt;"/>

You need to be an XML dork to appreciate this, but it's XML elements stored as escaped markup inside XML attributes.
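Getting the real markup back out takes two parsing passes: one to read the OPML, and one to parse what was hiding in the attributes. This is a simplified sketch using the JDK's DOM parser rather than XOM, and it assumes the joined text attributes form a single well-formed element, which won't hold for every Manila message.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ManilaOutline {
    /** Parses XML text into a DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /**
     * Joins the text attribute of every outline element. The parser has
     * already turned &lt; and &gt; back into angle brackets by this point.
     */
    static String embeddedMarkup(String opml) {
        NodeList outlines = parse(opml).getElementsByTagName("outline");
        StringBuilder markup = new StringBuilder();
        for (int i = 0; i < outlines.getLength(); i++) {
            markup.append(((Element) outlines.item(i)).getAttribute("text"));
        }
        return markup.toString();
    }

    /** Parses the recovered markup a second time and returns the item's title. */
    static String newsItemTitle(String opml) {
        Document inner = parse(embeddedMarkup(opml));
        return inner.getElementsByTagName("title").item(0).getTextContent();
    }
}
```

Fed the four outline elements above wrapped in a root element, newsItemTitle returns the headline about Symantec.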

Sun Sees the Light on Java Applets

I'm working on the next edition of Sams Teach Yourself Java in 21 Days, an 800-page monster that will cover Java 6 so thoroughly that all the other Java authors will stop writing their books and pursue retraining for a non-technical profession. (Computer book authors should talk smack like rappers. One of these days I'm going to start an East Coast/West Coast feud with Seattle's Glenn "PC-Diddy" Fleischman.)

Ten years ago, the original edition of Java in 21 Days made a big deal out of Java applets, web-based programs that were the world's first exposure to the language. The first Java boom was sparked by then-Netscape executive Marc Andreessen's decision to add a Java interpreter to the Navigator browser.

As the years passed, the world realized that an applet is a terrible thing to do to a web browser. Even today, with five iterations of Java to improve performance, you can tell when a page contains an applet: Your hard drive starts spinning furiously as the Java Plug-in loads and there's an interminably long pause before the page displays. Fortunately for authors like me, Java found a better niche in servlets, mobile devices and enterprise applications.

The next edition of my book relegates browser applets to an appendix. By the time Java 7 rolls around, I may dump the subject entirely.

Need more proof that applets are dead? If you go to Sun's Java.Com homepage, you may see a cool demo of a Fast and the Furious: Tokyo cell phone game that's written in Java.

The demo loads quickly and incorporates fast-moving graphics synchronized perfectly with sound. When I saw it, I was so impressed that I dug into the page's source code, wanting to find out how Sun accomplished such great effects using an applet.

The answer: They wrote it in Flash.

NoWomenJustMen: The Roster at Most Tech Conferences

I heard from one of the organizers of the Spring Experience, an enterprise Java conference organized by NoFluffJustStuff that I criticized for assembling a 38-speaker roster that doesn't include a single female.

He never responded to my request to run his e-mail in full, but this quote sums it up:

We sought out a qualified speaker who was female. She is on your list. Unfortunately, she is in very high demand (as one would probably expect!) and in the end could not commit due to a scheduling conflict. Even with the conflict, we went the extra mile to accomodate her because she brought something different and refreshing to our target audience. Unfortunately, she just couldn't commit.

We're actively pushing bright girls out of professions like programming by reinforcing the idea that technological fields only appeal to one gender. The brain drain this causes has to be incredibly detrimental to this country's competitiveness, discouraging 51 percent of the population from pursuing these fields even as we rely more heavily on them in our economy.

Settlement Reached with Dave Winer

I've reached an agreement with Dave Winer regarding the Share Your OPML web application. I destroyed his original code and user data along with everything that was built from it and gave up my claim to a one-third stake in feeds.scripting.com. He gave up the claim that he's owed $5,000.

I originally hoped one of us would buy the other out and launch the application, but we found a much stronger basis for agreement in a mutual desire to stop working together as quickly as possible.

If Share Your OPML was a Java project I would've been heartsick to destroy it, but I coded the application in PHP. I've never written anything in PHP I didn't want to completely rewrite six months later.

Some people think I'm an asshat for taking this public, and I won't argue with that, but I don't have the resources to fight an intellectual property lawsuit against a millionaire. Winer knows this -- he's been a guest in my home -- and it's clear his attorney was acting from the same assumption throughout the settlement negotiation.

I decided the best way to avoid court was to show Winer what it would be like to sue a blogger.

I figured the publicity would be a stronger motivator to resolve the matter than anything I could say through an attorney. He's one of the most galvanizing figures in the technology industry. If he ever sues someone, the publication of the case's motions and depositions will put a blog in the Technorati Top 100. Since publishing the letter from Winer's attorney, my traffic's through the roof, I'm getting fan mail and I received three programming job offers.

I'm extremely grateful for the public support and the offers to contribute to a legal defense fund on my behalf, which I was hoping might lead to a Free Kevin-style sticker-based political movement.

Some programmers have said that I was foolish to write the app on the basis of a verbal agreement, and I'll concede that wholeheartedly. I won't even do the laundry now without something in writing.

I'm not going to close the book on this debacle with any Panglossian happy talk about how it all worked out for the best. This was a completely unnecessary sphincter-fusing legal dispute that could have been settled amicably months ago without benefit of counsel.

But I'm glad to stop pursuing an application so closely associated with OPML, because I don't share Winer's enthusiasm for the format.

I used to feel differently, but now that I've worked with it extensively, OPML's an underspecified, one-size-fits-all kludge that doesn't serve a purpose beyond the exchange of simple data. There's little need for an XML dialect to represent outlines. Any XML format is a hierarchy of parent-child relationships that could be editable as an outline with a single addition: a collapsed attribute that's either true or false.

Developers who build on OPML will encounter a lot of odd data because the format has been extended in a non-standard way. An outline item's type attribute has a value that indicates the other attributes which might be present. No one knows how many different attributes are in use today, so if you tell users that your software "supports OPML," you're telling them you support arbitrary XML data that can't be checked against a document type definition.

OPML's also the only XML dialect I'm aware of that stuffs all character data inside attributes. Now that OPML's being turned into a weblog publishing format, outline items will have ginormous attribute values holding escaped HTML markup like this:

<outline text="&lt;img src="http://images.scripting.com/archiveScriptingCom/2006/03/16/chockfull.jpg" width="53" height="73" border="0" align="right" hspace="15" vspace="5" alt="A picture named chockfull.jpg"&gt;&lt;a href="http://scobleizer.wordpress.com/2006/03/16/the-new-a-list/"&gt;Scoble laments&lt;/a&gt; all the flamers in the thread on &lt;b style="color:black;background-color:#ffff66"&gt;Rogers Cadenhead's&lt;/b&gt; site, but isn't it obvious that the &lt;i&gt;purpose&lt;/i&gt; of his post was to get a flamewar going? What non-flamer is going to post in the middle of a festival like that one? I'm not as worried about it as Scoble is, because I've seen better flamewars and I know how they turn out. In a few days he's still going to have to try to resolve the matter with me, and the flamers will have gone on to some other trumped-up controversy. The days when you could fool any number of real people with a charade like this are long past. And people who use pseudonyms to call public figures schoolyard names are not really very serious or threatening. &lt;a href="http://allied.blogspot.com/2006/03/lynch-mob-security.html"&gt;Jeneane Sessum&lt;/a&gt; is right in saying it's extreme to call this a lynch mob. It's just a bunch of &lt;a href="http://www.cadenhead.org/workbench/news/2881/letter-dave-winers-attorney#46458"&gt;anonymous comments&lt;/a&gt; on a snarky blog post. Big deal. Not.&nbsp;&lt;a href="http://www.scripting.com/2006/03/16.html#When:11:21:10PM"&gt;" created="Tue, 16 March 2006 11:21:10 GMT"/>

I'd be amazed if XML parsers can handle attribute values of any length, but that's what's being done today with OPML.

Now that an agreement has been reached, Winer doesn't have to share Share Your OPML and I can flee in terror before any border skirmishes lead to another XML specification war.

Maybe this is the best of all possible worlds.

Update: Winer appears to have launched a new PHP-based implementation of Share Your OPML with Dan MacTough.