Friday, September 26, 2014

The Basic Idea of Lossy Compression

Since around 1980, when the Internet followed closely on the heels of personal computers, we have lived in a world in which music and images are sent back and forth in highly compressed form.

How is a text file stored?

A text file basically consists of characters.  Long ago, the computer folks got together and decided that each character on a typewriter would be represented by a number.  Here are a few of them:

“A”=65,
“B”=66,
“C”=67,
“D”=68,
“E”=69,
“F”=70,
“G”=71,
“H”=72,
“I”=73,
“J”=74,
“K”=75,
“L”=76,
“M”=77,
“N”=78,
“O”=79,
“P”=80,


... and so on ...


“X”=88,
“Y”=89,
“Z”=90,

Which is all the capital letters.  After this come a few odd things, namely "[", "\" and so on.  Presently, we come to the lower-case letters:

“a”=97,
“b”=98,
...
“x”=120,
“y”=121,
“z”=122,
“{”=123,
“|”=124,
...

and so on; you get the idea.  The entire correspondence can be found here.  So if you made a text file that contained the single word zab, it would contain essentially the numbers 122, 97, 98, appropriately organized so that there is absolutely no mistaking them for 1229798, so don't worry.  Many modern documents are actually word-processor files, so they contain, in addition, italics information, boldface information, and so on, which makes them bigger.  (They also contain a lot of information of interest mainly to Microsoft Corporation, which adds to their size a tiny bit, e.g. "This file was created in Word; aren't we awesome?")
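Here is a small Python sketch of the same idea, using the word zab from above.  The variable names are just mine, for illustration; ord and chr are the standard Python functions for going between a character and its number.

```python
# Look up the code number for each character of the word "zab".
word = "zab"
codes = [ord(ch) for ch in word]
print(codes)                            # [122, 97, 98]

# Stored in a file, each code occupies its own byte, so the three
# numbers can never run together into one long number.
raw = word.encode("ascii")
print(len(raw), list(raw))              # 3 [122, 97, 98]

# Going the other way: numbers back to characters.
print("".join(chr(n) for n in codes))   # zab
```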

Lossless Compression

Even text files can be made smaller by studying them carefully.  This is what happens when you use something like WinZip to compress your file.  Such utilities use tricks to represent the most common letter in your text by a very short code, the next most common letter by a slightly longer code, and so on.  Working backwards, your file can be expanded to be exactly the way it was.  This sort of thing is called lossless compression, because the document can be reconstructed exactly at the receiving end.  This procedure, called encoding, depends on creating very compact lists of numbers to represent lists of numbers that are not so compact.

Make no mistake: the procedure is highly complex, and consists of multiple methods, each of which makes the resulting file smaller and smaller.
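To see lossless compression in action, here is a minimal sketch using Python's built-in zlib module.  I'm using it as a stand-in: it implements the DEFLATE method found in ordinary zip files, but I make no claims about what any particular utility does internally.  The sample text is made up, and deliberately repetitive.

```python
import zlib

# A deliberately repetitive sample text (an assumption, chosen so that
# there is plenty of redundancy for the compressor to find).
text = ("the quick brown fox jumps over the lazy dog " * 50).encode("ascii")

packed = zlib.compress(text)        # squeeze it
restored = zlib.decompress(packed)  # work backwards

print(len(text), "bytes before,", len(packed), "bytes after")
print(restored == text)             # True: nothing was lost
```

The point to notice is the last line: the restored text is identical to the original, byte for byte.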

GIF Files

CompuServe, a computer company, invented a method for compressing certain types of pictures using lossless compression; this method is still widely used today.  The compressed files have the extension "gif".  Here's essentially how it's done.

(1) A picture is first processed so that it has at most 256 colors.  Many pictures look okay when the number of colors is reduced; for instance, a chart or diagram will probably only contain 5 or 6 colors to begin with.  The colors are numbered 0, 1, 2, and so on, up to 255.  This is called the palette for the picture.  (It's a great feature that each picture's palette is individualized.  But still, the restriction of the number of colors is a sacrifice.)

(2) Next, the picture is divided into little 8x8 pixel blocks, and each block is encoded individually.  The color number is assigned to each of the 64 pixels.  Now it boils down to compressing these 64 numbers as much as possible.  Suppose the rectangle of numbers looks like this:

11122233
11223333
12233232
12333343
13344434
...
and so on.

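(3) At this point, a clever trick called run-length encoding is used.  (Strictly speaking, GIF itself uses a related but more elaborate scheme called LZW; run-length encoding is the simplest way to see the idea.)  The numbers are stretched out in a single long line, starting at the top left, and going in a diagonal pattern.  I'm not sure, but it could look something like this:
Following the arrows, you get the number list: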
1,1,1,1,1,1,2,2,2,1,2,2,2,2,1,2,3,3,3,3,...
As you can see, this method of scanning the picture optimizes the likelihood of long strings of repeated colors.  Such a list of numbers would be encoded something like:
1(6),2(3),1,2(4),1,2,3(4)
and so on.  This means: "six 1's, followed by three 2's, followed by a single 1, then four 2's ..."

There is obviously great potential to get a very small file indeed, if there are large stretches all of a single color.  If you ever had to store a GIF file of mostly white with a few black dots, you would have found it minuscule!
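The run-length idea can be sketched in a few lines of Python.  This illustrates only the counting of repeated numbers described above, not the actual GIF file format; the function names are mine, and the list of pixels is the one from the example.

```python
from itertools import groupby

def run_length_encode(values):
    """Collapse each run of equal values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]

def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]

pixels = [1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 3, 3, 3, 3]
runs = run_length_encode(pixels)
print(runs)
# [(1, 6), (2, 3), (1, 1), (2, 4), (1, 1), (2, 1), (3, 4)]

# Decoding gives back exactly what we started with, so this is lossless.
print(run_length_decode(runs) == pixels)   # True
```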

Lossy Compression of a list of numbers

In "lossy" compression (which means there is some loss; a typical computer type term...) the picture is approximated.  It is such a good approximation that nobody really minds.

First, let's see how a single row of 16 numbers can be approximated.  (Remember, this is a very rough explanation for laymen.)

Let's illustrate with a strip from a black-and-white photo.  The actual method is 2-dimensional, and not 1-dimensional, but the principles are very similar.  Suppose the strip of image resulted in

1, 3, 5, 8, 11, 15, 19, 24, 30, 36, 43, 51, 59, 68, 77, 87

Remember, these numbers are supposed to represent colors, which usually vary smoothly in a photograph, for instance.  Since these are single numbers, rather than triples, they represent a grayscale image.  (Each pixel of a color image is represented by three numbers.  But you can easily imagine that a color picture can be split into three grayscale pictures.  If I show you how a grayscale image can be stored lossily, you can easily see how the full color picture is stored.)  Just to get a feel for what these sixteen numbers might mean, here is a chart, where each number is represented by the height of a bar, instead of a pixel:
Now, each number is actually stored as bits; that is, a number consisting of a row of zeroes and ones only.  Just keep this in mind.
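If you'd like to see what "stored as bits" looks like, here is a tiny Python sketch.  I'm assuming 8 bits per number, which is enough for values from 0 to 255; the dictionary name is just for illustration.

```python
# Write a few pixel values out as 8 binary digits each.
bits = {n: format(n, "08b") for n in [1, 3, 87, 255]}
for n, pattern in bits.items():
    print(n, pattern)
# 1 00000001
# 3 00000011
# 87 01010111
# 255 11111111
```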

First, we store the average of the entire list of numbers, which is ... 34, to the nearest integer.  So we put that in as our first number, in storing the “picture”.

34.

Now, we subtract 34 from each number, which gives us a new list:
-33, -31, -29, -26, -23, -19, -15, -10, -4, 2, 9, 17, 25, 34, 43, 53.

Let’s average the left 8 numbers, and the right 8 numbers:  we get -23 (roughly), and 22 (roughly).  We store these, as well.  So now, we have:
34, -23, 22.

Now, we subtract -23 from the first 8 numbers, and 22 from the second 8 numbers.  We get 16 much smaller numbers:
-10, -8, -6, -3, 0, 4, 8, 13, -26, -20, -13, -5, 3, 12, 21, 31

We average the numbers in groups of 4:  we get (roughly) -7, 6, -16, and 17.  We store these, too:
34, -23, 22, -7, 6, -16, and 17.

We subtract -7 from the first four numbers, 6 from the next four, and so on.  We end up with even smaller numbers than before:

-3, -1, 1, 4, -6, -2, 2, 7, -10, -4, 3, 11, -14, -5, 4, 14

At this stage, we can simply stop.  If we restore the entire list of 16 numbers, pretending that the list contained all zeros at this stage, you know we're only going to be off by, well, 14, at the most!  We could do another step, and be off by 7 at the most.

The fun part would be to compare the "picture" we assemble using these numbers with the original list.

Start with all zeros, and add -7 to the first four, 6 to the next four, -16 to the next four, and 17 to the last four; we get
-7, -7, -7, -7, 6, 6, 6, 6, -16, -16, -16, -16, 17, 17, 17, 17.

Now we add -23 to the first 8, and 22 to the last eight:
-30, -30, -30, -30, -17, -17, -17, -17, 6, 6, 6, 6, 39, 39, 39, 39

Finally we add 34 to every number:
4, 4, 4, 4, 17, 17, 17, 17, 40, 40, 40, 40, 73, 73, 73, 73

Compare with the original file:
1, 3, 5, 8, 11, 15, 19, 24, 30, 36, 43, 51, 59, 68, 77, 87
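For anyone who wants to try this at home, the whole procedure (average, subtract, halve the groups, repeat, then rebuild) can be sketched in Python as below.  This is only my toy averaging scheme, not what JPEG really does, and the function names and the levels parameter are made up for the illustration.  Each average is rounded to the nearest whole number.

```python
def block_averages(values, size):
    """Rounded average of each consecutive block of `size` values."""
    return [round(sum(values[i:i + size]) / size)
            for i in range(0, len(values), size)]

def compress(values, levels):
    """Store averages of blocks of n, n/2, n/4, ...; return them plus leftovers."""
    residual = list(values)
    stored = []
    size = len(values)
    for _ in range(levels):
        averages = block_averages(residual, size)
        stored.append(averages)
        # Subtract each block's average from every number in that block.
        residual = [v - averages[i // size] for i, v in enumerate(residual)]
        size //= 2
    return stored, residual

def decompress(stored, length):
    """Rebuild the list, pretending the discarded leftovers were all zero."""
    result = [0] * length
    for averages in stored:
        size = length // len(averages)
        result = [v + averages[i // size] for i, v in enumerate(result)]
    return result

original = [1, 3, 5, 8, 11, 15, 19, 24, 30, 36, 43, 51, 59, 68, 77, 87]
stored, leftovers = compress(original, levels=3)
print(stored)
# [[34], [-23, 22], [-7, 6, -16, 17]]
# (the right-half average is 179/8 = 22.375, which rounds to 22)
print(leftovers)
# [-3, -1, 1, 4, -6, -2, 2, 7, -10, -4, 3, 11, -14, -5, 4, 14]
print(decompress(stored, len(original)))
# [4, 4, 4, 4, 17, 17, 17, 17, 40, 40, 40, 40, 73, 73, 73, 73]
```

Calling compress with levels=4 keeps one more set of averages, which makes the rebuilt list less blocky at the cost of storing eight more numbers.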


As you can see, it is horrible!  I was feeling pretty depressed, thinking that this was a terrible example to show you, but this file of 16 numbers is just too small for any lossy compression to be illustrated.  I started again with 32 numbers, got an average, got two averages, got four averages, and got 8 averages, and threw out everything after that.  (In other words, I pretended that after I took the last 8 averages, that I was left with all zeroes.)

Here is a graph of the original file (in blue), and the compressed file (in red).  As you can see, the compressed data tends to look blocky, just as you've noticed JPEG files do.

Finally, I explain why this procedure results in even greater space savings than you might imagine.

Suppose the color intensity of the original picture varies from a pixel that's pure black (0) to one that is pure white (255).  All the pixels must have values between these two extremes.  The average, too, must lie between these two numbers.  These sorts of numbers take 8 bits (of zeros and ones) to express in binary.

But, usually, the average is about 128.  If a picture is such that the average value of its pixels is about 128, it will compress well, otherwise it will compress poorly.

Once you subtract the average out, the remaining list of numbers must be typically between -128 and 127, which can be represented by 8 bits, again.

Once the second set of averages is subtracted out, the numbers will typically lie between -64 and 63, which only need 7 bits!  After the next set of averages, they will lie between -32 and 31, which can be represented by 6 bits!  Once you get to the point where you think that the itty bitty leftovers don't need to be stored, you don't use any additional space at all.

If your picture had 32 pixels, uncompressed, you would need 32×8 = 256 bits to store it.

Compressed using the primitive method described above, the first average is 8 bits, the next two averages are 7+7=14 bits; the next four averages are 6+6+6+6=24 bits, and the next eight averages are 5+5+5+5+5+5+5+5=40 bits.  So the whole package would only take up 8+14+24+40 = 86 bits.  Additionally, it is possible to cheat by only storing those averages approximately!  It is done using binary digits, but it is comparable to storing highly-rounded versions of the averages, which take even less space.  It is done cautiously, based on how the eye perceives the image, and whether a minor misrepresentation of an average color of a block, or sub-block, makes a big difference to the appearance of the image.
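The bit counting above is easy to check with a few lines of Python.  The list of levels simply restates the numbers in the text (how many averages at each stage, and how many bits each one gets); the variable names are mine.

```python
# (number of averages, bits per average) at each stage
levels = [(1, 8), (2, 7), (4, 6), (8, 5)]

compressed_bits = sum(count * bits for count, bits in levels)
uncompressed_bits = 32 * 8

print(compressed_bits, uncompressed_bits)   # 86 256
```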

With larger files, the savings are enormously greater; with photographs, if they vary smoothly across the picture, the savings will be great, but if there is a lot of detail, you just can't throw out the last remaining residues.  More and more averages need to be retained, which all contribute to the size of the file.

The actual method, though basically similar to what was shown above, uses sines and cosines and trigonometry to make the calculations easier.  This article in Wikipedia might make more sense, now that you have read this introduction.  Even if you just look at some of the pictures, you could get a handle on how it's done, completely disregarding the mathematics.

Arch

Friday, September 19, 2014

Is it imperative to raise more money than the Republicans?

Certain elements of the Democratic Party, and Liberals generally, have bought into the belief that elections can only be won, and GOP propaganda can only be countered, by engaging in an expensive media war.

Of course, nobody really knows, since for more than a decade we haven't looked at any alternatives.  Move On, and various machines that have been set up to work on behalf of liberals, are constantly asking people on their mailing lists to "Chip in a few dollars" to do thus and so.  It's a media campaign in North Carolina, or a media campaign in California.

One can't help but wonder where all this money actually ends up.  In the hands of the GOP, right?  Because most owners of the media are conservatives.  Unless I'm very much mistaken, the offices and the media owners of even liberal media are actually owned by conservatives.  In any case, media companies are hungry for cash (as are we all, I suppose) and tend to encourage this trend to buy media time and resources to fight propaganda battles electronically.

The public, it appears, is getting less and less intelligent, and more and more gullible, and it seems to take a lot of money to persuade anyone of facts that are self-evident.  Fox News has only to hint that there's something wrong with Barack Obama, such as that he has something absolutely improbable, like epileptic fits, and the Liberal Media immediately goes into a panic overdrive to persuade everyone that it is not so.

Good government has to be deserved.  If all our neighbors insist on being idiots, insist on getting inferior services, insist that, for instance, dietary information on food products is not needed, that music is not needed in schools, that we don't need clean air, and that we don't need clean water, that we only need a lot of energy at any cost, and that only dirty energy such as gasoline and coal is any use, and that we don't need to protect endangered species like fish, wildlife and whales, then there's little we can do.  It is a theorem that stupidity can only increase!  It's a form of entropy.

But there's reason to believe that a lot of people actually do want clean energy; they do want inexpensive health care, they do want safe jobs, and information at the grocery store, and honest government.  They're just tired of arguing with illogical conservatives!  They're just waiting for Election Day.

Getting people to the polls is most definitely a priority.  Forget the media campaigns.  It is important for people who stand for the things that we stand for to reveal themselves.  It isn't important, in my humble opinion, to raise a lot of money for massive media campaigns.  Let's just say no to fundraising.

Wednesday, September 10, 2014

My Book List

(These days, everyone seems to start with “So,” so here goes.)

So, people are nominating each other furiously to publish their ten most favorite books.  The phrasing seems to be different in each case; sometimes it is the ten books that stayed with them the most; sometimes it is the ten books to which they return most often; other times it’s something else.  Perhaps for some people, picking ten books out of all those you’ve read is easy.  But that’s like picking ten of your favorite people.  Like picking ten people to thank if you win the Academy Awards.  It’s silly; you just can’t do it!  I certainly couldn’t.

I read like a fury until the age of, oh, forty, I’d say.  Then I started writing, and began to read my own stuff (writing, I mean), which sounds silly, but, well, that’s what I do.

Some of the books I’ve read, I’d much rather nobody knew about.  Other books I’ve read won’t mean much to anyone unless they’re in my field, or share my interests.  Yet other books I’ve read just don’t bear reading today, simply because we don’t talk like that anymore, and the stuff is obnoxious.  I just picked up one of them the other day, and some passages were appalling, for how badly they were written, or how ignorant the author comes across as being.  Let’s face it: I read a lot of crap.

It’s that fact, more than anything, that makes me determined to join the hordes of those who want to air their dirty reading laundry.  But I have to editorialize and comment on many of the books.

G. A. Henty: The Cat of Bubastes

This has to stand for about seven books by Henty that I read, including the gruesome “With Cortez in Mexico,” which began my political awakening.

Arthur Conan Doyle: A Study in Scarlet

There’s not much to say about this, but it stands for nearly fifteen books I read as a teenager, after my Dad laid a Sherlock Holmes treasury on me.  Great style.

Leslie Charteris: The Lady or the Tiger

See what I mean?  Again, this one must stand for a ton of Saint books I read.

Louisa May Alcott: Little Men

(I read “Little Women” too, but off the record.)  I loved this book, and my writing was influenced greatly by Ms. Alcott’s style, which, I know, tends to the sentimental.

Edgar Rice Burroughs: Son of Tarzan

One of the most romantic books, especially for a sixteen-year-old, who doesn’t know much about anything.

Gertrude Norman: Letters of Composers

This is an anthology, and is probably one of the books that influenced me most.

C. P. Snow: Variety of Men

It would have been more impressive if I had said Two Cultures, but I started that one when I was, like, seventeen, and quickly put it down, and read this one instead.

Herbert Goldstein: Classical Mechanics

I wasn’t sure whether textbooks were allowed on this list, but I see a few in other people’s lists, so why not?  This is a brilliant book, and I love this gentleman dearly (through his books.  I have never met him).

Laura Ingalls Wilder: The Long Winter

I read these a lot later in life than most people, but I only learned of them when I was an adult.  Describes the Pioneer experience to the rest of us.

Laura Adams: Seeds of Fire

Laura Adams is a pseudonym of Karin Kallmaker.

Donald Johanson and Maitland Edey: Lucy: The Beginnings of Humankind

I can’t believe people have forgotten all about this book.

Terry Pratchett: Wyrd Sisters

You’ll never know what humor can be unless you give this one a try.

I think I’ve gone a couple over.  I’ve also left out a ton of books by Marion Zimmer Bradley, starting with Darkover, and ending with Avalon, and the tragically neglected Firebrand.  I have also read almost everything Anne McCaffrey has written, as well as Arthur C. Clarke, James P. Hogan, and numerous science fiction writers.  And I’ve left out the crazy books by Douglas Adams: the Hitchhiker set and Dirk Gently, as well as books by Agatha Christie, J. R. R. Tolkien, and David Eddings, especially the Belgariad series of the latter.  I’ve also left out the lovely novels by Jessica Duchen: Rites of Spring, and Muriel Barbery: The Elegance of the Hedgehog.  But I’m more concerned with conveying the variety of books I have read, than in listing the ones most folks are likely to recognize.

Oops, forgot Harry Potter!  Also forgot Susan Haley's Buffalo Jump, and Rebecca West's The Fountain Overflows, which Susan brought to my attention.  This is going to be bad; I'm going to be screwing around with this list every time I remember another book.

Guess what: I also forgot ... wait ... I've forgotten.

[Added later:
If any of you are Anne of Green Gables fans, yes, I have read the books, and I love them :)  I first read them just a few years ago, at an age when I was a lot more difficult to bowl over!

Piers Anthony wrote a number of series, and I have read several of them.  Unfortunately, they're not easy to get into.

I have read the Hardy Boys books, and I must say that I agree with my wife that they (the Hardy boys) were fatheads.  Nancy Drew was a lot more likeable.

Few of you could possibly remember a series about a village priest in Italy, and the communist mayor of the town: Don Camillo, and Peppone.  These characters were created by Giovanni Guareschi, and serialized in a magazine, and published only posthumously as novels.

I read a couple of books by Morris West, a number of James Bond books, and a couple of books by Alistair MacLean (Ice Station Zebra, Where Eagles Dare), but I prefer books without too much mayhem in them.  Dan Brown's Da Vinci Code comes close to my tolerance level.

A book that impressed me deeply was Isaac Asimov's Guide to the Bible, well worth reading by anyone who wants to demythologize their understanding of so-called Biblical times.

The Physics and Chemistry of Life, an anthology by Scientific American, which the famous journal kept hidden from my eyes for more than 20 years, is an amazing tour of the mechanics of the life processes.  Anyone with a little basic chemistry can understand it.  It is now available online from Cengage.

Feynman, Leighton and Sands's The Feynman Lectures on Physics was actually an enjoyable read, and taught me a lot of mathematics.  Feynman had the same power to explain as Leonard Bernstein, using beautiful, conversational language.  (It was actually a transcript of spoken lectures.)

I loved reading my daughter's copies of Tamora Pierce's stories, especially the Alanna books, and the Keladry books!

A.

Tuesday, September 2, 2014

Yet another writer who knows what's wrong with schools

A recent post on Slate tries to focus a little more attention on a person who has a lot more influence on the quality of a school than teachers: The Principal.

Well, okay, sure.  But this looks like one more fad that won't provide the answers.  Just remember: there is no single culprit in the crime that is American Education.  But certainly, a good principal can do more to influence the quality of a school than all the teachers combined.

What do you think an ambitious teacher getting his or her first appointment in a school wants to do?  Get into the administration.  It's more money, and a lot more opportunity to get away from the drudgery of classroom work, and a lot more power.  So, generally speaking, you can expect a typical principal to be in it for the money.  Teachers who, after many years, are still in the classroom, are probably there because they like to teach, or they prefer to deal with kids than with their parents.  Among teachers and principals alike, parents have a reputation for whining and complaining about their good-for-nothing kids.  Good parents will deal with their kids head on.  Bad parents want the school to deal with their kids.  (If you have kids in school, do not be offended; I'm making broad generalizations, and you may not fall into either category.  If you see yourself in these words, there's nothing to stop you adjusting your behavior: your kids are the legacy you leave the world, not the school's legacy.)

The article correctly identifies the biggest single factor in the quality of the school: whether the school is in a poverty-stricken area, or whether it is in an affluent area.  But the author continues to point at principals as the ones to watch, rather than the economics of the area.  Poor people in America have very little, and education is one of the principal tragedies in the experience of the poor.

The turnover rate of teachers in poor schools is high.  But guess what: the turnover rate of principals in poor school districts is also high.  Teachers and principals are not more mercenary than anybody else.  After a while, the salary provided by a school in a marginal school district will no longer serve to support the family of an educated man or woman.  (Pay attention: the wealthy are greater consumers of everything than the Middle Class, and members of the Middle Class are greater consumers of all sorts of resources than the indigent.  I keep saying that the wealthy are in a better position to avail themselves of resources such as airports and harbors and highways to remote resorts (at least partly built at public expense) than the masses.  If a tollbooth were set up, say, on a highway through Grand Teton National Park, and every vehicle was required to report the entire gross income of its occupants for the previous year, I daresay it would be far greater than that of a typical traveler on, say, I-80.  The wealthy use more of the people's resources, which is why they should pay more taxes.)

The second major factor contributing to the ineffectiveness of American education is simply this: because of the materialistic nature of our society, education is a means to the end of a higher paycheck.  No one is valued simply for their erudition.  The educated person is not held in high regard unless he or she is very well employed.

Should an educated person be held in higher regard than one who is not, other things being equal?  In America, no, because poverty can be an obstacle to education.  In a country where education is free, yes; a person who scorns the opportunity to learn has to earn our pity.  It is fashionable for demagogues to profess scorn for education, but it is education that could lead us out of this cycle of viciousness and destruction, this illogic that passes for cleverness in today's society.  Public spokespersons everywhere: TV anchors, political leaders, heads of corporations, all arouse embarrassment and pity in our hearts.  We can't just give up, of course, but it certainly would be a lot easier if our fellow-citizens really knew what they were talking about.

No, principals alone can't fix education, but they certainly have the power to push in the right direction.  Teachers can't fix education, and it does not help to start witch-hunts to discover the 'bad' ones.  But uninterested teachers must certainly be discouraged from taking up the profession.  Politicians alone cannot fix education, but they can certainly make the problems of education a million times worse by indulging their instincts for opportunism.  Parents certainly can help improve education, just by educating their own children better.  But parents on their own can't fix the entire problem.  I can't see a solution, but making Principals the scapegoats is certainly not going to fix the problem.  To be honest, the writer of the piece in question was not doing that; it was just a suggestion that it was as well to ensure that we appoint principals of the right quality.  We're all desperate for solutions, and this is probably a reasonable response to the situation.

Arch, exhausted
