Friday, 18 March 2011

How To Ease The Pain of SAS in Production - Part 2

In the first part we looked at some of the data issues that I've found using SAS in production. Here are some of the coding issues.

SAS code is programming. Use descriptive names for variables and tables. Indent. Comment.
Since you’re going to delete all your working tables, don’t re-use their names across proc sorts, data steps and SQL queries. I call the de-duped result of proc sort on table X, “sorted_X”. Then I know where I am and so do you when you read the code.
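
As a minimal sketch of that naming habit, assuming a hypothetical table X with a key field customer_id and a column status (none of these names come from a real system):

```sas
/* Hypothetical input: work.X, keyed on customer_id */
proc sort data=work.X out=work.sorted_X nodupkey;
    by customer_id;
run;

/* Later steps read sorted_X; X itself is never overwritten */
data work.active_X;
    set work.sorted_X;
    where status = 'A';   /* status is an assumed column */
run;
```

Every step writes to a new, descriptively named table, so the log and the library listing show which stage each table came from.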

You should not put usernames and passwords in code. And passwords expire. Use an input form to get those – make friends with the %Window macro.
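
Something like the following might do as a starting point. It is only a sketch: the window name and the macro variables db_user and db_pass are made up for illustration, and %WINDOW needs an interactive SAS session, so it will not help in a batch run.

```sas
/* Prompt for credentials at run time instead of storing them in the script. */
/* credentials, db_user and db_pass are hypothetical names.                  */
%window credentials rows=12 columns=60
    #3 @5 'User name:' @18 db_user 20 attr=underline required=yes
    #5 @5 'Password: ' @18 db_pass 20 attr=underline display=no required=yes;

%display credentials;
```

DISPLAY=NO stops the password showing as it is typed. After the window closes, the values are available as &db_user and &db_pass for the rest of the session.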

Unless everyone in the team has the same SAS database modules, use CONNECT TO ODBC (PROMPT) to access external databases.
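
A rough sketch of what that looks like in a pass-through query, assuming a hypothetical accounts table with account_id and balance columns on the other side of the ODBC connection:

```sas
proc sql;
    /* PROMPT opens the ODBC driver's own dialog for data source and login */
    connect to odbc (prompt);

    create table work.accounts as
    select *
    from connection to odbc
        ( select account_id, balance
          from accounts );          /* accounts is a hypothetical table */

    disconnect from odbc;
quit;
```

Nothing about the data source or the login is stored in the script, which also covers the point about usernames and passwords.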

On Mondays and Wednesdays I prefer PROC SQL to DATA steps, and on Tuesdays and Thursdays I prefer DATA steps. On Friday, I have no opinion. DATA steps are usually faster; PROC SQL saves brain-space, because SQL is a skill you use both in SAS and in the outside world.

SAS encourages you to fix problems in a production run by hacking at the original script: a quick resubmit here to run that little bit, a change of dates there, a little extra code… Do NOT do this: by the time you’ve finished the whole thing, you’re too tired and confused to un-hack the script. Next time you use it, it will not work. Fix the script. Re-run it from the start EVERY TIME. Then when it works, you know it will work next time.

SAS is too clever for its own good with dates. If it sees that a destination field in another database is a DATE field, it will assume the source field is a SAS date and translate that date into the destination database’s date format. This means you need to get Oracle dates as SAS date numbers (days since 1 January 1960), like this: DATE - to_date('1960-01-01','yyyy-mm-dd').
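
Here is a sketch of that subtraction inside a pass-through query. The accounts table and its account_id and open_date columns are assumptions, and trunc() is added only to drop any time-of-day part so that the result is a whole number of days.

```sas
proc sql;
    connect to odbc (prompt);

    create table work.accounts_dated as
    select account_id,
           open_date_sas format=date9.   /* show the day count as a date */
    from connection to odbc
        ( select account_id,
                 trunc(open_date) - to_date('1960-01-01','yyyy-mm-dd') as open_date_sas
          from accounts );               /* hypothetical Oracle table */

    disconnect from odbc;
quit;
```

Oracle returns a plain number, and SAS treats it as a date only because of the format applied on the SAS side.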

If you’re starting by taking data from a database into a SAS table, do as much processing of the data in the SQL as possible, even data type conversions. I use a lot of pass-through queries to an Oracle database, and I find it a lot easier to write Oracle SQL than work out how to do the same thing in SAS. Also, lots of data manipulation in SAS makes for really scrappy-looking code.

Sure, when SAS has a powerful bit of functionality it is really powerful (DATA / UPDATE combines a SQL append and update query in one simple bit of code), but when it doesn't, it hurts. This is how you do a string concatenation that references a couple of variables (startdate and enddate) in VBA…

Dim dateConditionString As String
dateConditionString = "and DATE_B is null and DATE_A BETWEEN to_date('" & startdate & "','yyyy-mm-dd') AND to_date('" & enddate & "','yyyy-mm-dd')"

and this is how you do it in SAS…

%let dateConditionString = "and DATE_B is null and DATE_A BETWEEN to_date('&startdate','yyyy-mm-dd') AND to_date('&enddate','yyyy-mm-dd')";
data _null_;
call symput("dateConditionString", compress(&dateConditionString, '"'));
run;

This is not even scripting. It's DOS-level batch coding.

Wednesday, 16 March 2011

My Non-Alienating Moleskine Cahier

I use a Moleskine cahier to hold my reminder list. Each item gets a number - because I have a little bit of that obsessive-compulsive thing - and when it's done, I draw a line through it. When every item on a page is done, I put a diagonal line across the page. When I change my mind about an item, I put a cross through the number and a line through it. I jot down stuff when it occurs to me. The List is usually on the left page and the right page is for telephone numbers, odd details, books I want to price on Amazon and anything else.

My appointments diary is on iCal on my MacBook Pro, co-ordinated with Google Calendar and transferred to my phone now and again.

My contacts list is in Address Book on my MacBook Pro, co-ordinated with Google Contacts. And I use Gmail and access that through Mail.

That's it. That's how I manage my life. I used to wish I had a life that needed a Time Manager, or even a bulging Filofax, let alone an iPhone and multiple Google Calendars, but I now don't.

You see, reminders are one thing, but To-Do Lists and Plans and Projects are another, and we should not fall for it. My life is not a project, and I am not a project, even though I undertake multi-part activities over a period of time that have a purpose. It's one thing to plan the re-decoration of your hallway, but another to make it a Project. Projects have budgets, plans, targets, and can fail or succeed. Projects make you their servant, and the plans become a way for you to judge yourself. Make something a "project" and you alienate it from yourself.

Stick to re-decorating your hall. Or re-making the garden. Or writing a series of posts on algebraic geometry. Or finding another job. Those are parts of your life.

Monday, 14 March 2011

W H Auden's Canzone

I don't really do poetry. T S Eliot, of course, he's like Beethoven, even if you don't do classical music you like Beethoven, and even if you don't like poetry, you can be impressed by Eliot. And Mayakovsky, of whom I have a two-volume Russian edition of his complete poems. Sylvia Plath, but then I like Joni Mitchell as well, so call me a sensitive girl. And W H Auden. You would think that if I like Auden, I would like Spender and all the war poets and probably Keats as well. But I don't. I just like Auden. Canzone is my favourite - but you have to get the right edition of his works to find it - and I'm not sure I can explain why. It has a view of life and our place in it that rings true to me, but I'm not actually sure it makes a lot of sense, rather like Joni Mitchell's The Jungle Line. It sounds wonderful, so who cares?

Canzone - W H Auden
When shall we learn, what should be clear as day,
We cannot choose what we are free to love?
Although the mouse we banished yesterday
Is an enraged rhinoceros today,
Our value is more threatened than we know:
Shabby objections to our present day
Go snooping round its outskirts; night and day
Faces, orations, battles, bait our will
As questionable forms and noises will;
Whole phyla of resentments every day
Give status to the wild men of the world
Who rule the absent-minded and this world.

We are created from and with the world
To suffer with and from it day by day:
Whether we meet in a majestic world
Of solid measurements or a dream world
Of swans and gold, we are required to love
All homeless objects that require a world.
Our claim to own our bodies and our world
Is our catastrophe. What can we know
But panic and caprice until we know
Our dreadful appetite demands a world
Whose order, origin, and purpose will
Be fluent satisfaction of our will?

Drift, Autumn, drift; fall, colours, where you will:
Bald melancholia minces through the world.
Regret, cold oceans, the lymphatic will
Caught in reflection on the right to will:
While violent dogs excite their dying day
To bacchic fury; snarl, though, as they will,
Their teeth are not a triumph for the will
But utter hesitation. What we love
Ourselves for is our power not to love,
To shrink to nothing or explode at will,
To ruin and remember that we know
What ruins and hyaenas cannot know.

If in this dark now I less often know
That spiral staircase where the haunted will
Hunts for its stolen luggage, who should know
Better than you, beloved, how I know
What gives security to any world.
Or in whose mirror I begin to know
The chaos of the heart as merchants know
Their coins and cities, genius its own day?
For through our lively traffic all the day,
In my own person I am forced to know
How much must be forgotten out of love,
How much must be forgiven, even love.

Dear flesh, dear mind, dear spirit, O dear love,
In the depths of myself blind monsters know
Your presence and are angry, dreading Love
That asks its image for more than love;
The hot rampageous horses of my will,
Catching the scent of Heaven, whinny: Love
Gives no excuse to evil done for love,
Neither in you, nor me, nor armies, nor the world
Of words and wheels, nor any other world.
Dear fellow-creature, praise our God of Love
That we are so admonished, that no day
Of conscious trial be a wasted day.

Or else we make a scarecrow of the day,
Loose ends and jumble of our common world,
And stuff and nonsense of our own free will;
Or else our changing flesh may never know
There must be sorrow if there can be love.

Friday, 11 March 2011

How To Ease The Pain of SAS in Production - Part 1

One of the many toys we got when The Bank bought The Other Bank way back in the mists of time was the need to use SAS and Business Objects. Full-featured Business Objects might be neat, but what we got really is a frustrating mess and is painfully slow. Then there's SAS. People make very good money using SAS. It's mainly used in financial services, retail analysis and pharmaceuticals. Its inventor is not quite as rich as Bill Gates, but close. They have made a decision to put the resource into incredibly fast implementations of complicated statistical methods and basic data handling, rather than into user interfaces and a slick scripting language and IDE. That's a choice.

SAS may be a great analytical tool, but it really sucks when used for production. People will tell you that it’s really good “once you get used to it”. Which is a different way of saying "it really sucks until you stop minding". Unfortunately, SAS gets used for production: these are some hints about dealing with it.

The main issue is this: if SAS is being used for production, you’re in a non-professional data environment. Expect the random SAS tables you're going to be using to have duplicated records, no defined key field and every sort of data abuse imaginable and quite a few more that aren’t. People really think it’s cute to store ID numbers as strings with leading 0’s.

If in doubt, use a proc sort with the NODUPKEY option on what should be the key field. I call this a "brute force de-dupe" because that's what it does. You will need it to de-duplicate the results of your queries on these megabyte messes. (If you’re using a table without a key field, you’re on your own.) And always do a brute-force de-dupe to get the final table you’re actually going to use.
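
A minimal sketch of the brute-force de-dupe, assuming a hypothetical query result keyed on customer_id. The DUPOUT= option is an addition here rather than part of the tip itself; it keeps the discarded rows so you can see what was thrown away.

```sas
proc sort data=work.query_results
          out=work.deduped_results
          dupout=work.dropped_duplicates   /* rows discarded by NODUPKEY */
          nodupkey;
    by customer_id;                        /* what should be the key field */
run;
```

NODUPKEY keeps the first row it meets for each key value, so which duplicate survives depends on the order of the input.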

If you’re importing data from other people’s workbooks, assume they will change the order of the fields, the field names and the number of fields every time they give you the report. To be fair, this applies in any environment, but you can’t do much about it in SAS.

If you’re going to modify a reference table with a SAS script, take a backup first.
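
One way to do this, as a sketch: copy the table under a date-stamped name before the script touches it. The libref refdata and the table branch_codes are hypothetical, and the libref is assumed to be assigned already.

```sas
/* Build a stamp such as 20110311 from today's date */
%let stamp = %sysfunc(today(), yymmddn8.);

/* Copy the reference table before anything modifies it */
data refdata.branch_codes_&stamp;
    set refdata.branch_codes;
run;
```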

Never write a script that ends by committing updates. You CANNOT rely on the script to a) run all the way through without errors, b) produce the output in a format you think it should, c) produce the data you thought you were going to get, d) not suddenly create duplicate records because someone did something you would never think of doing. Finish a production script with a stats query that looks for tell-tale null field values, brings back volume and value totals you can check and so on. Commit the changes to your master table with another script.
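
A closing stats query along these lines could serve as the final step. It is only a sketch, and the table work.final_table with its customer_id and amount columns is an assumption for illustration.

```sas
proc sql;
    select count(*)                         as row_count,
           count(distinct customer_id)      as distinct_keys,
           sum(case when customer_id is missing
                    then 1 else 0 end)      as missing_keys,
           nmiss(amount)                    as missing_amounts,
           sum(amount)                      as total_amount format=comma18.2
    from work.final_table;
quit;
```

If row_count and distinct_keys differ, duplicates have crept in. The totals are there to be checked against a figure you trust before a separate script commits anything.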

Put all your temporary working tables into the rwork library. That way you won’t create long-term clutter. Whatever you write will be more portable, as everyone has an rwork library. (Unless they don’t have a server – then use work.)

Part Two follows...

Monday, 7 March 2011

March? It's March Already? How Did That Happen?

Oh no! It's almost half-way through March already. The year is almost over! I've organised no holidays, my life is vanishing before my eyes!

I do this every year. January drags by interminably, February comes and suddenly it's half-way through March and I have done exactly nothing all year. That I had planned on doing. Face it, the last six weeks have sucked. Big time. On Jan 25th I find out I haven't got a cancerous bump on my skull, but by Thursday 16th February, it's turned into a sebaceous cyst, which burst on the 24th. I catch The London Winter Cold on the weekend of the 19th and that stays with me for the next two weeks to the point where I even have a day's sick leave on the 22nd, and I'm still coughing a bit now. My back locks up and is painful from about Monday 8th February to about the 20th. I spent last week dazed on Night Nurse because how else was I not going to wake up at 01:00 hours coughing to clear my lungs? The weather has been grey, grey, grey. February sucked. Really. Sucked.

I've only just come out of it today. The weather was clear blue all day. I've been for a slow walk round Virginia Water, a pizza in Twickenham, a while sitting in a sun-trap corner of my garden reading a book on Leonor Fini I've had for ages, before giving the grass its first post-winter cut and toddling off to see The Adjustment Bureau and falling in love with Emily Blunt. When I came out of the cinema, the sky was a hundred shades of sunset blue-green-orange-brown.



Which is what I call a day off. I'm still looking for holidays. I've tried looking at retreats. I'm getting the feeling that they are likely to be a bit girly, well, middle-aged womanly. On the other hand, if that means the usual hotel crowd stays away, put off by tales of yoga and self-revelation, maybe that's a plus.

Friday, 4 March 2011

Things I Saw Where I Lived and Walked: Part 9


The wooden boat was on the Parkland Walk, a disused railway line between Highgate and Finsbury Park, which ran past the block of flats where I lived for a couple of years in the early 1980s. Low tide at Watchet harbour, Somerset one summer morning in the late eighties. Snow in Bushy Park and a peek at Richmond Baths through autumn trees - I suspect late eighties as these are all Olympus OM10.

Wednesday, 2 March 2011

Remarks on the Phenomenology of Dating

One of The Gang asked me when was the last time I went on a date? I gave some sort of politely evasive answer like "my memory isn't that good", which evasion is allowed by my age and maturity. The real answer is "what the frak would I do that for?" Or rather "why the frak would I go through the motions?"

A date is two adults, after 7:30 in the evening, with the possibility of sex. If there's no possibility of sex, it's not a date, it's just a meal or a movie or a night at the opera. Now, at the end of every weekday evening I have to catch a commuter train - cab fares from central London are silly - so it's going to be an early evening. Sunday night is a school night, and so is Friday night because I have housework to do Saturday morning. Saturday night is for the under-thirties and people who don't get out during the week: grown-ups don't do Saturday night dates. This leaves zero possibility of sex. Even before we factor in the whole age thing. Let alone the differential looks thing (women my age look it, I don't: women who look the age I look are ten to twenty years younger, and they aren't going to date me). Why would I go on something that looked like a date, when I know in advance that it isn't? I can go to the theatre just fine on my own thank you.

It's actually worse than that. Though I would be good company, I doubt there would be one moment when I actually thought of whoever she was as a woman. A woman is a female with whom sex is a possibility: once it's no longer a possibility, she's not a woman, she's just a guy with estrogen.

I've begun to realise that this is actually a general phenomenon. I'll give you another example. I'm Pilates Class Guy. You've seen me: one guy, maybe two, in bloke-ish shorts and sloppy sweat shirt, wearing socks. The rest of the class are women of varying ages and looks, all wearing clothes that fit and with bare feet. If you're Pilates Class Guy, you realise after a couple of sessions that your natural masculine instinct to check out the women is a little, well, creepy in such a confined space. Especially when half of them would be as old as your daughter if you had a daughter. So you focus on the exercises, stop looking at the women and after a while, and I mean, in less than a class, they've stopped being women.

A person is someone with whom we have dealings, or think we might have dealings, or wish we didn't have to. (The technical term for the last type is "assholes".) The someones on the escalators and stairs on London Transport aren't people: they're just mobile obstructions to be dodged round. The staff behind the counter at Fernandez and Wells are people: I have dealings with them. I don't know their names, but they are people. The Gang at work are people and personalities: I have dealings with them, gossip and banter with them, I take what I know about them into account in my dealings with them. The women in the Pilates class aren't people and even if they were, they wouldn't be women. Women are females-with-whom-sex-is-a-possibility. There is no possibility of sex with any of them.

In fact, actually going out on a date with someone I might, in other circumstances, have liked to go on a date with, would spoil it. As a fantasy date she's a woman, as a real date, she's someone I'm saying "Thank you, I had a lovely time" to, before I catch a ten o'clock train home, and has stopped being a woman. Which is a nifty little Catch-22. Or to put it another way: if you know you're coming back home alone at the end of the night, why the hell did you bother going out in the first place?

Now that last bit may be as much my fault, in that it's a blind alley I've driven myself into, but that's not the point. The point is how it shapes my view of and feelings towards the world. It makes it a more empty place.