Thursday, 26 April 2012

Let's go data digging

With the nuts and bolts of our basic Trading database now in place we can record individual trades and examine them in detail and in summary on one useful form. I personally think that for the 'man hours' invested progress so far has been very good. It's now time to accelerate that a little and start to draw some interesting facts from our database.

Welcome to the enthralling world of database queries. Those totally new to this kind of thing might already be thinking what an accomplished programme Access is; but in truth very few professional db programmers would use it as the basis to build a commercial application. Such applications are much more likely to be developed using a pro DB like SQL Server or Oracle; both of which are comlplicated, capable and make Access look very much the younger, weaker sibling. Nevertheless the underlying principles of data separation (normalisation to give it its proper title), relationships and the basic structure of the DB are very similar. The big strength of Access in simple applications like ours lies in its relative simplicty. Under the hood, all three programmes use SQL - which stands for Structured Query Language - normally pronounced 'see kwell' in polite conversation.

Luckily for us Access enables us to use SQL almost completely graphically, so with no further ado let's create a basic Query.... On the main ribbon select Create / Query Design. You are then presented with the following screen:


The first query we will create will be a basic summary query, using data from all three of our tables, so click on 'Add' for each table and then 'Close' when finished:


Notice how the relationships we created last time are shown. The simple way to get what we want is to follow the structure of the Trades table. So double click on TradeID and that field will magically appear in the leftmost column of the designer. Do the same for TDate. When we get to sportID we want the sport to display rather than the ID, so doucle click on 'Sport' from the Sports table. Then it's back to Trades for the Event field, and then over to the Strategy table for the strat name. The rest of the fields can be entered from the Trades table, but for this exercise we don't need the Description field. You should have something like this when finished...


That's it! our first query is now created and ready to run. So go to the top left of the ribbon and click on the large exclamation mark (! for non English speakers) and you'll see something like this:


If this look very similar to the Trades table - that's because effectively that's what we've done - replicated it! Click 'Save' and you'll be prompted to come up with a name to replace the somewhat limiting 'Query 1' that it's called at the moment. Fashions change with respect to naming DB objects, but I always like to prefix my query names with the letters 'qry' so I'll call this one 'qryTradesBase'. The logic of me using the word base will become apparent in subsequent outings.

So all that effort has really just succeeded in duplicating something we've already done! So far, so useless!

Let's enhance it a bit by enabling us to select a range of dates to filter the data. Return to design view and place your cursor in the 'Criteria' box in the TDates column and type the following: Between [Start date:] and [End date:] - notice how logical the language is... the words in the brackets are prompts for the user, so go ahead and click 'Run' - and enter a start date when asked and an end date (you might need to change the dates of some of your dummy records if, like me, all your entries so far have been in April 2012! Put a couple in March and run the query with 1/4/12 and 30/4/12 as the parameters and you'll see only April trades.)

Hold on, you say, if I want all my trades to appear I don't really want to have to remember the date of my first trade and enter that every time.... what a pain that would be! Fortunately you don't have to.... take your cursor to the box immediately underneath the criteria you've just entered, and type the following: [Start date:] Is Null And [End date:] Is Null.

Run the query again, but this time just click OK for both the start and end dates. All your trades will now be displayed. For this to work it is crucial that the prompts are spelt and punctuated exactly the same way in both boxes - otherwise Access will assume you wish to filter by three or four parameters and all kinds of weird and wonderful results might ensue. This one little tweak adds an enormous amount of utility I think you'll agree. By entering at most two dates you can now see your trades by all time, by the month, the week or even the day (just enter the same date for Start and End if you just want one day) - simple as that!

We'll just do one last little tweak to this query for now - namely to decide whether an individual trade was a winner or not. The test for this, of course, is very simple! A positive PL entry means 'Yes' and a negative PL entry means 'No'. However, all we want to do is to record a number... 1 for Yes, 0 for No. Here's how to do it...Go to the last column in the designer, and in the field name type: 'Winner:' (with the colon - that tells Access we are entering a user defined field) and then right click and select our old friend the 'Build' dialog. Another old friend, the Iif function is used to decide how this is filled - you need to end up with this in the build dialog; Winner: iif([PL]>0,1,0). Run the query again and you should see something like this:


Just in case you were wondering what this work would look like if written in SQL... here it is in all its glory:

SELECT Trades.TradeID, Trades.TDate, Sports.Sport, Trades.Event, Strategy.Strategy, Trades.Stakes, Trades.PL, IIf([PL]>0,1,0) AS Winner
FROM Strategy INNER JOIN (Sports INNER JOIN Trades ON Sports.SportID = Trades.SportID) ON Strategy.StratID = Trades.StratID
WHERE (((Trades.TDate) Between [Start date:] And [End date:])) OR ((([Start date:]) Is Null) AND (([End date:]) Is Null))
ORDER BY Trades.TDate;
Bet you're glad we can do it graphically!

2 comments:

  1. Let's give Access its due: it's really a very capable little database package, particularly for small single-user applications such as the one presented here. By "small", I'm thinking of sizes up to, say, 100MB, although the program can handle a lot more. Beyond that, I'm happier to move into the more "industrial" class of DBMS, MySQL & PostgreSQL being the more obvious free options. Oracle, which I use professionally, appears bloated almost beyond redemption with all its "enterprise" features, and in any case the free version, like MS SQL Server Express, is limited to 2GB. All of these, however, have less integrated features than Access (although lots of tools of varying quality are available).

    ReplyDelete
  2. Thanks for the comment, Mike. I agree with your comments about the capabilities of Access, and I didn't mean in any way to belittle it!

    To be honest when I undertook this little exercise I wondered about using SQL Express as I don't think the 2GB limit will worry too many sports traders! The reason I didn't is because I can't get the 'Management Suite' installed and running on my Win 7 machine despite much googling and forum hunting - it's now on the 'too difficult' list! I therefore decided that Access with its highly graphical interfaces was the best option.

    OK so far, btw?? :-)

    ReplyDelete