Archive for the ‘Databases’ Category

Keeping up with database changes

Friday, June 11th, 2010

Scenario: several developers are hard at work cranking out code. The application under development relies on RDBMS back-end for persistent storage (in this particular case, the database is Microsoft SQL Server 2005, but the technique described applies to any RDBMS supporting DDL triggers). Developers are making changes to the client application code, creating/altering/dropping database objects (stored procedures, tables, views etc.) and, in the heat of the moment, forgetting to communicate the changes to their teammates left alone the project manager…

Yes, I know – this is not how it supposed to happen, and yet in the world out there, more often than not, it does happen… Here are some do-it-yourself ideas on how you could alleviate the pain and spare you some nasty surprises without buying more tools…

Enter DDL Triggers. This is relatively new feature with Microsoft SQL Server (though Oracle had them for ages), and, among many other things (rolling back changes, for instance), it could be used to solve the problem stated above.

A DDL (Data Definition Language) trigger in MS SQL Server can have two scopes – server and database. The Table 1.1 at the end of this post lists all the events for which DDL trigger could be created, grouped by scope. For the full syntax in creating a DDL trigger please see vendor’s documentation; here I will only touch basics needed to illustrate a solution.

Here’s a database scop trigger we are going to use to monitor events:

CREATE TRIGGER [tr_DDL_ALERT] ON DATABASE —- trigger is created in context of a given database
FOR CREATE_TABLE, DROP_TABLE, ALTER_TABLE    —- which events to capture; see Table 1.1 for full list
AS         —-
use DDL_DATABASE_LEVEL_EVENTS captures all DB events
SET NOCOUNT ON
DECLARE @xmlEventData XML —- the generated event data is in XML format
SET @xmlEventData = eventdata() —- get data from the EVENTDATA() function

Now, this trigger would not be much of use to anybody; you need to parse information contained in the XML message passed into your trigger upon the event. You could parse it and send an email message, or you could save it into a database, or both.

The following code saves it into a table [tbDDL_ALERT] – which, of course, has to be created beforehand:

INSERT INTO dbo.tbDDLEventLog
(
EventTime
,EventType
,ServerName
,DatabaseName
,ObjectType
,ObjectName
,UserName
,CommandText
)
SELECT REPLACE(CONVERT(VARCHAR(50), @xmlEventData.query(‘data(/EVENT_INSTANCE/PostTime)’)),’T‘, ‘ ‘)
,CONVERT(VARCHAR(100), @xmlEventData.query(‘data(/EVENT_INSTANCE/EventType)‘))
,CONVERT(VARCHAR(100), @xmlEventData.query(‘data(/EVENT_INSTANCE/ServerName)‘))
,CONVERT(VARCHAR(100), @xmlEventData.query(‘data(/EVENT_INSTANCE/DatabaseName)‘))
,CONVERT(VARCHAR(100), @xmlEventData.query(‘data(/EVENT_INSTANCE/ObjectType)‘))
,CONVERT(VARCHAR(100), @xmlEventData.query(‘data(/EVENT_INSTANCE/ObjectName)‘))
,CONVERT(VARCHAR(100), @xmlEventData.query(‘data(/EVENT_INSTANCE/UserName)‘))
,CONVERT(VARCHAR(MAX), @xmlEventData.query(‘data(/EVENT_INSTANCE/TSQLCommand/CommandText)‘))

And sends out email notifications using potentially obsolete extended stored procedure (assemble message (@body variable) from the elements of the XML message as shown in the example above):

EXEC master..xp_smtp_sendmail
@TO = ‘me@somewhere.com
,@from = ‘someone@somewhere.com
,@message = @body
,@subject = ‘database was modified
,@server = ‘smtp.mydomain.com’

Long-term solution would be, of course, configuring SQL Server Database Mail.

In my next post I will describe how database triggers could be integrated with Hudson - an open source Continuous Integration (CI) server.

Table 1. List of the values to use with server and database scope DDL triggers

Server Scope Database Scope
ALTER_AUTHORIZATION_SERVER
CREATE_DATABASE
ALTER_DATABASE
DROP_DATABASE
CREATE_ENDPOINT
DROP_ENDPOINT
CREATE_LOGIN
ALTER_LOGIN
DROP_LOGIN
GRANT_SERVER
DENY_SERVER
REVOKE_SERVER
CREATE_APPLICATION_ROLE
ALTER_APPLICATION_ROLE
DROP_APPLICATION_ROLE
CREATE_ASSEMBLY
ALTER_ASSEMBLY
DROP_ASSEMBLY
ALTER_AUTHORIZATION_DATABASE
CREATE_CERTIFICATE
ALTER_CERTIFICATE
DROP_CERTIFICATE
CREATE_CONTRACT
DROP_CONTRACT
GRANT_DATABASE
DENY_DATABASE
REVOKE_DATABASE
CREATE_EVENT_NOTIFICATION
DROP_EVENT_NOTIFICATION
CREATE_FUNCTION
ALTER_FUNCTION
DROP_FUNCTION
CREATE_INDEX
ALTER_INDEX
DROP_INDEX
CREATE_MESSAGE_TYPE
ALTER_MESSAGE_TYPE
DROP_MESSAGE_TYPE
CREATE_PARTITION_FUNCTION
ALTER_PARTITION_FUNCTION
DROP_PARTITION_FUNCTION
CREATE_PARTITION_SCHEME
ALTER_PARTITION_SCHEME
DROP_PARTITION_SCHEME
CREATE_PROCEDURE
ALTER_PROCEDURE
DROP_PROCEDURE
CREATE_QUEUE
ALTER_QUEUE
DROP_QUEUE
CREATE_REMOTE_SERVICE_BINDING
ALTER_REMOTE_SERVICE_BINDING
DROP_REMOTE_SERVICE_BINDING
CREATE_ROLE
ALTER_ROLE
DROP_ROLE
CREATE_ROUTE
ALTER_ROUTE
DROP_ROUTE
CREATE_SCHEMA
ALTER_SCHEMA
DROP_SCHEMA
CREATE_SERVICE
ALTER_SERVICE
DROP_SERVICE
CREATE_STATISTICS
DROP_STATISTICS
UPDATE_STATISTICS
CREATE_SYNONYM
DROP_SYNONYM
CREATE_TABLE
ALTER_TABLE
DROP_TABLE
CREATE_TRIGGER
ALTER_TRIGGER
DROP_TRIGGER
CREATE_TYPE
DROP_TYPE
CREATE_USER
ALTER_USER
DROP_USER
CREATE_VIEW
ALTER_VIEW
DROP_VIEW
CREATE_XML_SCHEMA_COLLECTION
ALTER_XML_SCHEMA_COLLECTION
DROP_XML_SCHEMA_COLLECTION

Time to Move on: James Gosling leaves Oracle

Wednesday, April 14th, 2010

As of April 2, 2010 the ”Father of Java” is no longer with Oracle. This follows departures of Monty Widenius (2009) and Ken Jacobs (2010)   Oracle might have acquired the body but the soul is gone…

A cute slide presentation from eWeek: The Life and Times of Java and James Gosling

MDM market consolidation

Wednesday, February 3rd, 2010

Informatica acquires Siperian.  I had my bets on SAP (it was not a good fit for Microsoft or IBM) snapping it or, maybe, Oracle. I wonder what was the driving force: the software capabilities or its client base (with all this rush for EHR, HIE, HIMSS and the rest of the healthcare alphabet soup :) ?

Tunnel vision(s)

Tuesday, January 26th, 2010

At the inaugural meeting of the New York Technology Council Thursday night, Google Vice President of Research Alfred Spector and Microsoft architect evangelist Bill Zack debated their views on how data will be stored and shared in the future.

Google leads shift to the “web as platform” paradigm, and Microsoft has a grip on desktop, and – to a significant degree – on the server market. Not surprisingly, they see the world through their respective rosy glasses: Google wants everything to be in the cloud (“network computing”, anyone?), and Microsoft puts forward his “three screens” strategy blancing its cash cows – Windows + MS Office – with a bet on cloud computing, the new Azure platform. Google does not have the legacy ties, it was in the cloud business from day one, though recent developments such as Android and Chrome OS indicate that they might be bridging the gap in opposite direction…

If the only tool one has is a hammer suddenly every problem starts looking like a nail..

FUD for thought

Thursday, January 21st, 2010

“To be uncertain is to be uncomfortable,but to be certain is to be ridiculous. ” Chinese Proverb

The European Commission today (January 21, 2010) cleared Oracle’s agreement to acquire Sun Microsystems. What does it mean for the development community, specifically for the future of Sun’s crown jewels: MySQL, OpenOffice, GlassFish EE server, NetBeans… Oracle had almost a year to figure things out.

NetBeansis especially vulnerable given tha Oracle has competing JDeveloper (and Bea Java Dev tool); maybe it will be released as open source project to the community? Rolled into JDeveloper? Discontinued?

Why would Oracle need GlassFish when it already has Bea and Oracle AS? Cannibalization is very likely.

MySQL? Anybody’s guess, but I bet that it will be supported and development will continue; maybe will undergo Oracle-ization (for example, replace MySQL procedural extensions - just introduced in version 5.0 - with robust mature PL/SQL). Will it still be free? Given $1 bln Sun had spent acquiring it, and $7+ bln Oracle spent acquiring Sun, it seems plausible to assume that Oracle would try to squeeze some dough out of it. Its own flagship database sales were stung by ascending SQL Server and IBM.. I see PostgreSQL as a winner, the only enterprise capable true open source RDBMS on the market.

Java. Once positioned as a spear at Microsoft’s heart; not anymore – the landscape has changed, notably with Google becoming a major player, and Microsoft wisely playing its cards by releasing C# as open standard. Yet, I do not see Oracle donating Java to the open source community, most likely we’ll see variations of Sun’s controlled “Community Development Process”. Oracle made significant investment into Java, supporting it inside its products, and even creating its own IDE… but what is going to happen to infant JavaFX ? RIA market is getting saturated – Flash/FlexSilverlight, AJAX (and Ajax support frameworks such as GWT)… Apache Pivot looks darn promising..  Will Oracle have enough resources to spread around?

Solaris. SUN’s very own implementation of Unix operating system, arguable the best out there, AIX and HP-UX market penetration notwithstanding. For a long time Oracle and Solaris were inseparable; if an Oracle DBA did not run his database on Solaris he was somewhat deemed less competent. Then Linux came of age, and Oracle made huge bet on it (remember “Linux makes Oracle Unbreakable!”,or  was it other way around?). Now they OWN the platform that they flagship database was designed for. Will they ditch Linux? Unlikely. Linux is on upswing, it is robust, reliable and has enterprise level support. Will Oracle push Solaris? Not exactly their domain of expertise, and market of operating systems is not as lucrative as it used to be. Then there is issue of the Sun’s proprietary hardware – hugely overpriced, increasingly obsolete… Sun recognized that they cannot charge premium prices for the hardware that is becoming a commodity, and released x86 version of Solaris; it flopped (why x86 Solaris when I can run x86 Linux?). Apple seems to be able to create perception of superiority of both software (Mac OS) and hardware (Apple), but I credit Steve Jobs for it (to support my suspicion, follow the ups and downs of Apple stock plotted against timeline of Steve’s health news; also, reliability of Apple laptops lags that of Asus , Toshiba and Sony - yet there is unshakeable perception that Mac is light years ahead of lowly PC… yalk about selling sizzle!)

My bet is that Solaris will be retired over period of time in favour of Linux…. R.I.P.

NB: FUD  – Fear, Uncertainty and Doubt

SQL Server: passing data between procedures

Thursday, January 21st, 2010

The common programming task – passing parameters between functions – is far from simple in Transact-SQL. One has to pay close attention to a particular version of the RDBMS that implements the language. To add to confusion, ever since Microsoft SQL Server and Sybase had parted their ways (version 7.0 and 11.5, respectively), there are two ever diverging dialects of Transact-SQL.

This article How to Share Data Between Stored Procedures  by  Erland Sommarskog goes into excruciating details explaining different options a programmer has when there is a need to pass data between stored procedure. Saved my team some time, and provided an opportunity to learn. Thank you!

The following table is taken verbatim from the original post by Mr. Sommarskog, and links back to his site:

Method Input/ Output SQL Server versions Comment
Using OUTPUT Parameters Output All Not generally applicable, but sometimes overlooked.
Table-valued Functions Output SQL 2000 Probably the best method for output, but has some restrictions.
Inline Functions Use this when you want to reuse a single SELECT.
Multi-statement Functions When you need to encapsulate more complex logic.
Using a Table In/Out All Most general methods with no restrictions, but a little more complex to use.
Sharing a Temp Table Mainly for single pair of caller/callee.
Process-keyed Table Best choice for many callers to same callee.
Global Temp Tables A variation of Process-Keyed.
INSERT-EXEC Output SQL 6.5 Does not require rewrite. Has some gotchas.
Table Parameters and Table Types In/(Out) SQL 2008 Could have been the final answer, but due to a restriction it is only mildly useful in this context.
Using the CLR Output SQL 2005 Does not require a rewrite. Clunky, but is useful as a last resort when INSERT-EXEC does not work.
OPENQUERY Output SQL 7 Does not require rewrite. Tricky with many pitfalls.
Using XML In/Out SQL 2005 A roundabout way that requires you to make a rewrite, but it has some advantages over the other methods.
Using Cursor Variables Output SQL 7 Not recommendable.

Look Ma, no SQL!

Tuesday, December 29th, 2009

Is the Structured Query Language  goes the way of dinosaurs?
First proposed back in 1970s, the relational database technologies have flourished, taking over the entire data processing domain (with an occasional non-relational data storage hiding in long shadows of the [t]rusty mainframes). The days of glory may be over, and the reason could be  … yes, you’ve guessed it - a paradigm shift.

The relational databases brought order into chaotic world of unstructured data; for years the ultimate goal was to normalize data, organize it in some fashion, chop it into entities and attributes so it could be further sliced and diced to construct information… There was a price to pay though t - need for a set-based language to manipulate the data, namely, Structured Query Language - SQL  (with some procedural and multidimensional extensions trown in…)

The Holy Grail was to get data to 5NF, and then create a litter of data warehoses – either dimensional or normalized to analyze the data…. Then again, maybe we could just leave the data the way it is, stop torturing it into relational model – and gain speed and flexibility at the same time?  That’s what I call a paradigm shift!

Enter MapReduce: Simplified Data Processing on Large Clusters, another idea from Google (which also inspired Hadoop - open source implementation of the idea)

Google is doing it, Adobe is doing it, FaceBook is doing it, and hordes of other, relatively unknown, vendors are doing it ( lots of tacky names – CouchDB, MongoDB, Dynomite, HadoopDB, Cassandra,Voldemort, Hypertable :)

IBM, Oracle and Microsoft have announced additional features for their flagship products: the M2 Data Analysis Platform based upon Hadoop, and Microsoft extending its LINQ  (which goes past relational data) to include similar features… Sybase has recently announced that it implementes MapReduce in its SybaseIQ database.

To be true, the data still undergo some pre-processing to be fully managed by these technologies, but to a much lesser degree. The technology is designed to abstract intricacies of parallel processing, and to facilitate managementr of large distributed data sets;  it aims not to eliminate need for relational storage but the need for SQL to manipulate the data… the idea is to allow analytic processing of the data where it lives, without expensive ETL, and with minimal performance hit. The line is blurring between ORM, DBMS, OODBMS and programming environment; between data and data processing..

With all that said, it might not be the time to ditch your trusty RDBMS ( just yet…:)  A team of researchers concluded that “Databases “were significantly faster and required less code to implement each task, but took longer to tune and load the data,” the researchers write. Database clusters were between 3.1 and 6.5 times faster on a “variety of analytic tasks.”

Data Warehousing: an introduction

Tuesday, December 22nd, 2009

Here’s a brief (32 slides) introduction into Data Warehousing Concepts for Managers. The target audience of the presentation are managers and architects; the goal was to resolve confusion about basic DW concepts and terminology, and get everybody on the same page.

And this presentation (47 slides) targets developers Data Warehousing Concepts for Developers. There is significant overlap between the two, with developer’s version going into greater detail presenting underlying technologies.

Further reading:

For dimensional Data Warehousing  go to the source:

The Data Warehouse Lifecycle Toolkit  by Ralph Kimball 

For normalized Data Warehousing:

 Building the Data Warehouse  by W.H. Inmon

Sharepoint 2010: prayers answered… sort of

Thursday, October 22nd, 2009

The  SharePoint  2010 is coming!.. In fact, it is almost here – the rumour has is that beta was already released to MSDN subscribers last Wednesday (10/21/2009, that is). Not a revolutionary upgrade in a sense that SP2007 was to SP2003, but a major milestone nevertheless.

The interest in the product speaks for itself – the conference 2009 was sold out – and that’s 7,000+ attendees! I was a bit surprised to see that many people  from Europe – for some reason there were Danes at every turn – but I’ve also met folks from France, Germany, England, Spain, Czech Republic, Netherlands, Belgium, Finland, Israel… there were quite few Kiwis and Aussies, Brazilians and Argentineans..

Having inherited a team of hard core Java developers I am facing quite a challenge to help them in transition to proficient Sharepoint ones. I wish I could fly all of them to Las Vegas, just to feel the energy in the room, attend all sponsored events (yes, free drinks included), and see elegancy of some solutions! This would be the best way to assure them of an interesting and profitable career path… Damn these budget cuts!

Apparently, Microsoft heard the developers community, or, maybe, lusted after the fat profits that third party vendors made patching the holes left in the MOSS 2007… But the biggest improvements for me came from the developer’s perspective. The biggest gripe about Sharepoint development came from the fact that it is not a development environment. It is very painful to code against, debug and deploy in Sharepoint 2007, not to mention absence of testing framework (there are third party tools, to be sure), and then this insistence on using VSTS… (no, you do not need it – we are developing with VS2008 Professional). I am happy to report that SP2010 and Visual Studio 2010 have addressed this problem. Beginning from 2010, one can develop decent workfl0ws in Sharepoint 2010, and still maintain sanity (most of the time). First, you can develop on Windows 7 or Windows Vista SP1 (64 bit only, of course); no need for sluggish clunky VM that hogging down your computer (btw: you can boot to VHD, if you’d like to keep your SP development separate from the rest of your work). Second: no more add-in packages to do development in Visual Studio , no multiple deployment steps to install and activate provisioning code – it all built in; no SDK, no missing references.. Finally:  you can deploy from within Visual Studio (God, I missed this one!), and you can debug your stuff without jumping through additional hoops.

A big thing: WSP now can be created in SharePoint Designer (still free), and edited  in Visual Studio (and vice versa). Finally.

BDC are now BDS – business data services, and are better than ever. You can combine data from different data sources into a single consistent interface. Oh, and don’t forget the ability to create EXTERNAL lists. Sharepoint 2010 is all about RESTful services, and json; a novel way to provide and consume data; ability to create lists with multiple columns is very welcomed, too.

I am very intrigued by Access Services in Sharepoint 2010; I see how it can help me to track down every pesky MS Access app crawling in the enterprise, and manage them centrally. Very cool (more about it later).

LINQ is staging a comeback. After hinting that it will be deprecated in favor of Entity Framework, Microsoft again puts it at the heart of data interfaces. Virtually every demo that made use of data did it through LINQ; and it was prominently displayed on the slides presented by the m’softies. Something to think about.

All in all, it appears that our prayers (and curses) has been heard.