Open Office Tutorials
General Theory of CSV Files

You may find that the database being shipped with OpenOffice (ver.2 and higher) delights you as much as it has me. This page tries to help you use it.

Forget anything you may have heard about Adabas, which came with Star Office, the commercial version of Open Office 1. The current Open Office's database, "Base", aka "ooBase", is unrelated. And remember that Open Office, including ooBase, is free! But don't let the price fool you, and don't think it is new and untested. Big organizations, government and civilian, are adopting it as their standard office suite... and saving million$, but still Getting The Job Done.

There's more about ooBase in the main index to this material.

This page is "browser friendly". Make your browser window as wide as you want it. The text will flow nicely for you. It is easier to read in a narrow window. With most browsers, pressing plus, minus or zero while the control key (ctrl) is held down will change the text's size. (Enlarge, reduce, restore to default, respectively.) (This is more fully explained, and there's another tip, at my Power Browsing page.)

Page contents © TK Boyd, Sheepdog Software ®, 3/06.



CSV Files: Pages at this site _______________

This is one of at least four pages discussing CSV files in my ooBase tutorials. In this page, I attempt to convey the general principles of CSV files.

Other pages....



CSV Files: Theory _______________

CSV stands for "Comma Separated Values".

If I have the following little table of four rows, each with three values, in something, anything....

     2   abc  Fruit cakes
 1,000   b52  Donuts
  5.5    xyz  Apple pies
   -6    lmn  Partridges, in pear trees

... then how it is saved will depend on what program you are using to display the data. We won't normally CARE how it is saved. We'll just be happy that we can call it up whenever we need it.

However, if you want to move that data from one application to another, the "fun" can be tedious.

The CSV format is one (partial) solution, invented in the early days of computing.

The data above can be put in CSV- crudely- as follows:

"2","abc","Fruit cakes"
"1,000","b52","Donuts"
"5.5","xyz","Apple pies"
"-6","lmn","Partridges, in pear trees"

Each item has been enclosed in quotes.
Each item is separated from the next by a comma.

Once upon a time, the things in the first column were probably recognized as numbers. You might, for instance, have added them up. CSV moves us to "the simple life". The first item, the "2", has become just a "squiggle". It is no longer "half of four", or anything else "significant" like that. Inside the computer, there is one number, the code for the "2" shape. Before the data was converted to CSV, that "half of four" could be stored many, many ways. The second item, "abc", is just three numbers inside the computer: The codes for the "a" squiggle, the "b" squiggle and the "c" squiggle. (They were probably "just" those codes previously. But they might, before the conversion to CSV, have been augmented by codes to say what font they should be presented in, etc, etc.)
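To make the "squiggle" idea concrete, here is a small sketch in Python. (Python is just my choice for illustration; the variable names are made up for this example.) It shows the code number stored for each character, and what happens if you try to "add" two squiggles before turning them back into numbers.

```python
# ord() gives the code number the computer stores for a character.
code_for_2 = ord("2")                     # the code for the "2" shape
codes_for_abc = [ord(c) for c in "abc"]   # one code per squiggle

# As text, "2" cannot be added up; the squiggles are simply joined.
as_text = "2" + "2"

# Converted back to a number, it is "half of four" again.
as_number = int("2") + int("2")

print(code_for_2, codes_for_abc, as_text, as_number)
```

The code for the "2" shape is 50, which has nothing to do with the quantity two. That is all CSV keeps.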

CSV is very crude. That's its limitation and its glory. Being crude, interpreting it isn't hard. Being crude, you can't do much with it. But what you can do, you can usually do reliably.

Why the commas? They tell you where one item ends and the next begins.

Why the quote marks? Without them, "1,000" and "Partridges, in pear trees" would both look like two items. Commas inside quote marks are "hidden".
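Here is a minimal sketch of that "hiding", using the csv module that comes with Python. The sample data is held in a string purely for convenience; a real file would be opened and read instead. The sketch compares a proper CSV reader with a naive split on every comma.

```python
import csv
import io

# The four-line CSV sample from above, held in a string for illustration.
text = (
    '"2","abc","Fruit cakes"\n'
    '"1,000","b52","Donuts"\n'
    '"5.5","xyz","Apple pies"\n'
    '"-6","lmn","Partridges, in pear trees"\n'
)

# csv.reader splits on commas, but leaves commas inside quote marks alone.
rows = list(csv.reader(io.StringIO(text)))

# A naive split on every comma gets the second line wrong.
naive = text.splitlines()[1].split(",")

print(rows[1])   # three items, the first being 1,000
print(naive)     # four pieces, with stray quote marks
```

The reader returns three items for every line. The naive split breaks "1,000" in two, which is exactly the trouble the quote marks are there to prevent.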

What have you missed?

If I read the CSV file given above, I know I am looking at four records, each containing three fields. The file isn't two records of six fields, or anything else adding up to twelve items.

How Do I Know?

Although you can't, directly, see them, after every three items, there are "a line ends here" codes in the file. One line- one record.
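You can make the hidden "a line ends here" codes show up. The following is a small Python sketch, assuming the line end is the single "\n" code; some systems write the pair "\r\n" instead.

```python
import csv
import io

# Two records from the sample. "\n" stands for the invisible
# "a line ends here" code.
text = '"2","abc","Fruit cakes"\n"1,000","b52","Donuts"\n'

# repr() shows the text with the hidden codes spelled out as \n.
visible = repr(text)

# Each line end closes one record.
records = list(csv.reader(io.StringIO(text)))
line_end_count = text.count("\n")

print(visible)
print(line_end_count, "line ends,", len(records), "records")
```

Two line ends, two records, three fields in each.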

Fine points....

Fine Point 1:

What if there are gaps in the table?

Say our table was....

2     abc  Fruit cakes
1,000 b52  
5.5   xyz  Apple pies
-6         Partridges, in pear trees

(This is different from the one above in that there's no third item in the second row, nor second item in the last row.)

The CSV version of the above would be....

"2","abc","Fruit cakes"
"1,000","b52",""
"5.5","xyz","Apple pies"
"-6","","Partridges, in pear trees"

... The "missing" items are each represented by a pair of quote marks with nothing between them. They "hold the places" in the data structure.
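A quick check with Python's csv module (a sketch, with the data held in a string for convenience) shows that the gaps come back as empty items, and every record still has three fields.

```python
import csv
import io

# The "gappy" CSV sample from above.
text = (
    '"2","abc","Fruit cakes"\n'
    '"1,000","b52",""\n'
    '"5.5","xyz","Apple pies"\n'
    '"-6","","Partridges, in pear trees"\n'
)

rows = list(csv.reader(io.StringIO(text)))

# Each gap is read as an empty string, so the field count never changes.
field_counts = [len(row) for row in rows]

print(rows[1])
print(rows[3])
print(field_counts)
```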

Fine Point 2:

CSV data is not always separated by commas. For instance, we might use semicolons instead of commas, in which case our data (first, gapless version) might appear as....

"2";"abc";"Fruit cakes"
"1,000";"b52";"Donuts"
"5.5";"xyz";"Apple pies"
"-6";"lmn";"Partridges, in pear trees"

Note that the commas in "1,000" and in "Partridges, in pear trees" remain as commas.

The virtue of using some character other than a comma is that if you KNOW the other character will not appear in your data, you can dispense with the quote marks. The following would work fine, if your data would never have a semicolon in it...

2;abc;Fruit cakes
1,000;b52;Donuts
5.5;xyz;Apple pies
-6;lmn;Partridges, in pear trees

Tab characters, which, like the "line ends" characters, have an effect on what you see but can't themselves be seen, are sometimes used in place of the comma. (Many data sets which you might want to convert to CSV will be guaranteed not to have tabs in them by the nature of the program holding the data.)
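Most CSV readers let you say which separator to expect. As a sketch, Python's csv module takes a "delimiter" setting; the helper name read_with below is my own invention for this example.

```python
import csv
import io

def read_with(delimiter, text):
    """Read CSV text using the given separator character."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# Semicolon separated, no quote marks needed: the commas are just data.
semicolon_text = "1,000;b52;Donuts\n-6;lmn;Partridges, in pear trees\n"

# The same two records, separated by tab characters ("\t").
tab_text = "1,000\tb52\tDonuts\n-6\tlmn\tPartridges, in pear trees\n"

semicolon_rows = read_with(";", semicolon_text)
tab_rows = read_with("\t", tab_text)

print(semicolon_rows)
print(tab_rows)
```

Both versions give the same two records of three fields each, with the commas in "1,000" and in the partridges left untouched.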

Fine Point 3:

What if your data has quote marks in it?

Ah. The answer to this is tedious. Intelligent question, well done. Take my word for it: there ARE answers, but you don't want me to go into them.
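For the curious, one common answer is to write each quote mark that is part of the data twice. It is what Python's csv module does by default; other programs may use other rules. A short sketch, with a made-up variation on the partridges record:

```python
import csv
import io

# A record whose last field contains quote marks as well as a comma.
record = ["-6", "lmn", 'Partridges, in "pear" trees']

# Write it out as one line of CSV.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(record)
line = buffer.getvalue()

# Read the line back in to check nothing was lost.
round_trip = next(csv.reader(io.StringIO(line)))

print(line)
print(round_trip)
```

In the CSV line, the field is wrapped in quote marks and the inner ones appear doubled, as ""pear"". Reading the line back gives the original record.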

Fine Point 4:

"I've seen CSV files before", you say, "and not all of the fields' data were enclosed in quotes."

The data we've been using might be turned into the following....

2,abc,"Fruit cakes"
1000,b52,"Donuts"
5.5,xyz,"Apple pies"
-6,lmn,"Partridges, in pear trees"

Note that the comma in "1,000" was removed by the program creating this CSV.

Whether all or just some fields get quote marks (or some other delimiter) can be automatic or user controlled. It depends how clever your CSV maker is. The program that reads the CSV may not expect exactly what the writing program produced, but in general most programs do a good job both of exporting CSV and of importing it. (If the program "does" CSV at all!)
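As one example of a "user controlled" CSV maker, Python's csv module has a "quoting" setting. The sketch below writes the same record three ways; the helper name to_csv_line is my own, and other programs will offer different choices.

```python
import csv
import io

def to_csv_line(row, quoting):
    """Write one record as a line of CSV, using the given quoting rule."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=quoting, lineterminator="\n").writerow(row)
    return buffer.getvalue()

# The first item is a real number here, not text.
row = [-6, "lmn", "Partridges, in pear trees"]

quote_all = to_csv_line(row, csv.QUOTE_ALL)             # every field quoted
quote_text = to_csv_line(row, csv.QUOTE_NONNUMERIC)     # only non-numbers quoted
quote_needed = to_csv_line(row, csv.QUOTE_MINIMAL)      # only where a comma etc. demands it

print(quote_all, quote_text, quote_needed, sep="")
```

Only the last field, with its comma, gets quote marks under every rule.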



We know what a CSV file is. What is it good for? _______________

The point of a CSV file is that it can be read into another program, and a table similar to the original is created. CSV is the "lingua franca" for moving data between applications. However primitive something is, it will deal with CSV unless the authors are trying to lock others out of their application.

Some programs have only limited CSV support and cannot, for instance, recognize that the items in our first column are numbers. Even after import, it will not be possible to do arithmetic with those "numbers". But most CSV capable programs DO at least try to recognize numbers. Sometimes at the start of the import process you are given a chance to say, by hand, "try to treat the items in the following fields as numbers".
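What might "try to recognize numbers" look like? The following Python sketch is only a guess at the general idea, not a description of how any particular program does it. The function name to_number is hypothetical, and it assumes that commas inside a number are thousands separators, which is not true in every country.

```python
def to_number(item):
    """Return a number if item looks like one, otherwise return it unchanged."""
    cleaned = item.replace(",", "")   # assumption: commas are thousands separators
    try:
        return float(cleaned)
    except ValueError:
        return item                   # not a number: leave the text as it was

# The first column of the sample table, as it arrives from the CSV file.
first_column = ["2", "1,000", "5.5", "-6"]

numbers = [to_number(item) for item in first_column]
total = sum(numbers)

# Text fields pass through untouched.
still_text = to_number("Partridges, in pear trees")

print(numbers, total, still_text)
```

After the conversion the first column can be added up again, while the text fields are left alone.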

As you know I am a fan of Open Office, you will have guessed that I will tell you that Open Office has very extensive, very well put together CSV import facilities.



Where now? _______________

As I said at the beginning of the page: This is one of at least four pages discussing CSV files in my ooBase tutorials. In this page, the general definition and features of CSV files were given.

Other pages....

... and of course there's also the site's main menu!



Editorial Philosophy

I dislike 'fancy' websites with more concern for a flashy appearance than for good content. For a pretty picture, I can go to an art gallery. Of course, an attractive site WITH content deserves praise... as long as that pretty face doesn't cost download time. In any case....

I am trying to present this material in a format which makes it easy for you to USE it. There are two aspects to that: The way it is split up, and the way it is posted. See the main index to this material for more information about both.

