From: Steven Marzuola
Sent: 30-Sep-02 6:18 AM
To: dejavu-l@yahoogroups.com
Subject: Re: Fw: [dejavu-l] sql question

--- In dejavu-l@y..., "Danilo Nogueira" wrote:
> I finally found the time to test Alessandra's suggestion.
> Seems it did not work. Apparently, I got a list of all segments
> that did not have four words - meaning I also got the five
> worders etc.
>
> Trying to come to terms with Oliver Richard's SQL Guide for
> DV users and that, I think, may lead me into the Wonderful
> World of SQL, but I must confess I have not been able to make
> head and/or tail thereof so far. Will keep trying. Meanwhile,
> any help will be welcome. I think I must buy and eat a book
> on SQL (and another on HTML and another on VBA and another
> on %$#@!!! too. And that on translator's day, too!)

Danilo,

You won't need a book on VB, HTML, or SQL. I have never used any SQL reference other than Oliver's excellent introduction.

Exactly what do you want to do with SQL? If you still want to find segments in your MDB that have fewer than 4 words, here's how I would build the appropriate SQL command. (I tested it.)

First, maybe it would be helpful to say what SQL does. DV SQL is actually a subset of the very complicated and powerful language SQL. It's implemented in two parts in DV: SQL Select and SQL Execute. I'm not touching SQL Execute in this description.

When you use SQL Select, you give it a logical expression, and DV in effect asks a question about every pair in the database. The result is to display all pairs that meet the condition (very much like the Filter command). It doesn't modify, delete, or update anything. It just (temporarily) hides the pairs in the database that don't meet the condition.

By the way, I'm referring generically to "database", but you can run SQL commands on a DV project as well as in Database Maintenance and Terminology Maintenance.

For example, if you want to find all pairs in the MDB that contain the word "Brasil" in the source text, then you might try this expression:

    SourceText like 'Brasil'

In words: "Is the source text something like 'Brasil'?" If the answer is yes, then display the row. If no, then hide the row.

This is the basis of one type of logical expression. It's like algebra: there is a left side of the expression ("SourceText"), an operator ("like"), and a right side ("'Brasil'").

In this particular case I have given you an expression that won't do what we want. It will find a source segment containing only the word "Brasil" (or "BRASIL", or "BrAsIl": SQL is not case sensitive). But any segment with additional characters (such as the words "here in Brasil") would not match.

So let's add the "wildcard" character * (star, asterisk). This character acts the same as it does in MS Word, Find, Use Wildcards: it matches any run of characters, including none.

    SourceText like '*Brasil*'

This SQL Select expression (which you can paste into the box in the SQL Select dialog window) will find segments that say "here in Brasil", "Brasil 2 Germany 0", as well as the single word "Brasil".

Next, defining a word calls for a few more SQL constructions.

[abc] means match any one of the characters a, b, or c (or A, B, or C. SQL is case-insensitive. Did I mention that?). So, if you wanted to find segments containing either the English or the Portuguese spelling of Brasil:

    SourceText like '*Bra[sz]il*'

You can find the segments that do NOT contain either spelling of 'Brasil' by using a NOT:

    NOT (SourceText like '*Bra[sz]il*')

[a-z] in a 'like' expression will match any single character from a to z (or A-Z). This expression will match text in the database like "Note a" or "Note M", but will not match "Note 2" or "Note:"

    SourceText like '*Note [a-z]*'

Now for more special characters:

[!x] means match any single character that is NOT the letter x.
[!a-z] means match any single character that is NOT a letter.
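
If you want to try these patterns without opening DV, here is a rough sketch in Python. It is only an approximation and not how DV works internally: it assumes that Python's fnmatch wildcards (*, [abc], [a-z], [!a-z]) behave like DV's 'like' for the cases above, and the helper name like() is made up for this example.

```python
import fnmatch
import re


def like(text, pattern):
    """Rough stand-in for `text like 'pattern'` (hypothetical helper, not DV code).

    The pattern must match the whole text, and case is ignored.
    """
    # fnmatch.translate converts *, [abc], [a-z] and [!a-z] into a regular
    # expression anchored at the end of the string.
    regex = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    return regex.match(text) is not None


print(like("BrAsIl", "Brasil"))                   # True: case is ignored
print(like("here in Brasil", "Brasil"))           # False: extra characters
print(like("here in Brasil", "*Brasil*"))         # True
print(like("Brazil wins", "*Bra[sz]il*"))         # True: either spelling
print(not like("Brazil wins", "*Bra[sz]il*"))     # False: the NOT hides this row
print(like("Note M", "*Note [a-z]*"))             # True
print(like("Note 2", "*Note [a-z]*"))             # False
```

The one thing to watch is that the pattern has to cover the entire text, which is why the stars at both ends matter.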

Putting this all together, the following expression finds all source text that contains at least two words:

    SourceText like '*[a-z]*[!a-z]*[a-z]*'

This looks complicated, so let me rephrase it:

1. look for *
2. followed by [a-z]
3. followed by *
4. followed by [!a-z]
5. followed by *
6. followed by [a-z]
7. followed by *

Restated:

1. look for any characters (maybe none)
2. followed by a letter
3. followed by any characters (maybe none)
4. followed by a character that is not a letter from a-z
5. followed by any characters (maybe none)
6. followed by a letter
7. followed by any characters (maybe none)

In other words, that expression will match any source text that has at least two letters with at least one non-letter character somewhere between them. Other characters can be added and still not affect the match.

These source "sentences" would all match:

    a b
    c, d
    Brasil wins
    The rain in Spain

These source texts would not match:

    aa      (no non-letter character)
    aa,     (the non-letter character is not followed by a letter)
    1234a   (only one letter character)

Next, think about putting together a train. You add a locomotive, and more cars at the end until it's as long as you want. Each extra word adds another "[!a-z]*[a-z]*" to the end of the pattern. Here's a sequence that will find segments of 4 words or more:

    SourceText like '*[a-z]*[!a-z]*[a-z]*[!a-z]*[a-z]*[!a-z]*[a-z]*'

But wait: your goal was to find segments with FEWER than 4 words! Not a problem. Place a NOT before the whole thing. This reverses the original question.

    NOT (SourceText like '*[a-z]*[!a-z]*[a-z]*[!a-z]*[a-z]*[!a-z]*[a-z]*')

Any questions?

Steven
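
P.S. For anyone who wants to check the word-count patterns outside DV, here is a rough Python sketch. It is an approximation only: it assumes Python's fnmatch wildcards behave like DV's 'like' for these patterns, and the names like() and at_least_n_words() are invented for this example.

```python
import fnmatch
import re


def like(text, pattern):
    """Rough stand-in for `text like 'pattern'` (hypothetical helper, not DV code)."""
    regex = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    return regex.match(text) is not None


def at_least_n_words(n):
    """Build the 'train': one [a-z] per word, coupled by *[!a-z]*."""
    return "*" + "*[!a-z]*".join(["[a-z]"] * n) + "*"


two = at_least_n_words(2)    # '*[a-z]*[!a-z]*[a-z]*'
four = at_least_n_words(4)   # the 4-words-or-more pattern

for text in ["a b", "Brasil wins", "The rain in Spain", "aa", "aa,", "1234a"]:
    # The last column is the NOT (...) filter: True means fewer than 4 words.
    print(repr(text), like(text, two), not like(text, four))
```

One caution, at least in this approximation: the pattern really counts runs of letters, so an apostrophe or a digit inside a word splits it in two. "don't stop now" is counted as four words (don, t, stop, now).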