|
马上注册,结交更多好友,享用更多功能,让你轻松玩转社区。
您需要 登录 才可以下载或查看,没有帐号?立即注册
x
用一个库#bak_database存放这些历史数据。创立|数据
怎样创立一个乐成的数据堆栈(datawarehose),上面的故事将告知你!
Thecompanysfirstdatawarehouseprojectbeganwithacasualconversationbetweenseveralexecutivesontheirwaytolunch.ThepeopleinvolvedweretheITmanagerfordecisionsupportaswellasseveralmembersofadepartmentthathadjustdecidedtoinstalladatawarehouse.TheyhadplannedtoinstalltheirdatawarehousewithoutanyinvolvementfromtheITdepartment;nonetheless,thefollowingconversationensued:
"Weurgentlyneedadatawarehousetoanalyzeourdata!""Inthatcase,whydontyoutakeanOLAPtoolwithamultidimensionaldatabase?""Isitpossibletomakethesalesfiguresavailabletooursalespeople?""Yes,ofcourse,thatsnoproblembecauseofitsWebcapacity.""Weneedouranswersveryfast.""Performanceisntanissue,thedatarequestedcanbemadeavailableonalocalserver.""Great,whencanwestartouranalyses?""Installingsuchasystemshouldnttakemorethanafewweeks."
Encouragedbythesecasualtipsfromanexpert,thedepartmentdecidedtobuildadatawarehousethatcorrespondedexactlytoitsspecificneeds.Somemonthslater,thedatawarehousewasinstalledaccordingtotheoriginalspecifications.Afterthefirstsuccesseshadbecomepublicknowledgewithinthecompany,otherdepartmentsbegantoshowinterestinthedatawarehouse.Proudly,thesystemwasdisplayed,andenthusiasmwasspreading.Suddenly,eachdepartmentwantedtheirowndatawarehouse,andrequestsbegantopileuponthedesksoftheITmanagers.However,apartfromthecasualconversationpreviously,theITdepartmenthadnotbeeninvolvedinthedevelopmentofthisfirstdatawarehouse.Theprojectitselfhadbeenimplementedbythedepartmentwiththehelpofanexternalsystemintegrator.Nobodyhadplannedonintegratingadditionalusergroups.Ithadbecomeimperativetofurtherdevelopthissuccessfuldatawarehouse.Atthispointitbecameclearthatthedepartmenthadlockeditselfintoadatamartwithonlylimitedscalability,insteadofbuildingadatawarehousewithunlimitedcapacityforexpansion.Thisdifferencebetweendatamartsanddatawarehousesisabasicissuethatthewholecompanynowhadtoface.
WhattodoNext?
Basedonthesituationmentioned,somequestionsarise,suchas:Canadatawarehouseoriginallyconceivedonlyforonedepartmentbeusedforthewholecompany,orshouldnewdatamartsbebuiltforeachdepartment?Ifthelattersolutionispreferred,howwouldonedepartmentaccessdatafromanotherdepartment?Whowillguaranteethatalluserswillreceiveexactlythesameinformation?Thequestionaboutwhethertostartwithdatamartsoradatawarehousehasbeenwidelydiscussed.1,2,3Itcanonlybeansweredbyclearlydefiningtheprojectsgoals:doesithavetocovertheinformationneedsforcertaindepartmentsorisitseenasthefirststeptowardasharedenterpriseinformationpool.Ifonlydepartmentalneedsaretheissue,itwillsufficetoinstallsomeisolateddatamarts.However,ifacompanyregardsaccesstoanintegrated,company-widedatabaseascriticalforitsfuturesurvivalinthemarket,thenanenterprisedatawarehouseisthesolutiontoimplement.
Mythoughts,sofar,mayhavecreatedtheimpressionthatthereareonlytwooptions:eitherquicklyinstallafewdatamartstocoverafewdepartmentscurrentneedsforinformationorembarkontheexpensiveadventureofinstallinganenterprisedatawarehouse.Thereisathirdoptionthatcombinesthebestofbothworldsandcanbeimplementedquicklywithoutsacrificingfuturegrowthoptions.Thisthirdoptionistolaydownthefoundationofanenterprisedatawarehousebystartingwithascalabledatawarehouseframeworkinapilotproject.
HowtoProceed
Theproceduresthatleadtothisscalabledatawarehousepilotprojectarespecificallydesignedtosatisfytwoseeminglycontradictoryrequirements?fastdeliveryandexpandability.Assumingtheprojectiswellprepared,itshouldnottakemorethanthreeorfourmonthstoimplementafullyoperationalpilotforanenterprisedatawarehouse.Afteritisfinished,acompany-widedatawarehouseplatformwillbeavailableallowinguserstoexecuteconcreteanalysesanddevelopabetterunderstandingoftheirrealandsharedneeds.
Whatisthedifferencebetweenadatawarehousepilotandadepartmentaldatamart?Infact,thesetwoapproachesdiffermoreintheirstrategicgoalsthanintotalexpenditurerequiredforconceptualizationandimplementation.TorunaprojectwithinadepartmentmeansthatyoudonothavetonegotiatewithotherdepartmentsandITmanagers?somethingthatcanprovetime-consuming.Bycontrast,ifyouwanttoestablishacompany-wideproject,youmustcoordinatethiseffortwithotherdepartments,ITmanagementandtopexecutives.
Figure1:Theprocessofcreatingapilotforanenterprise.
Figure1showsapreparatoryphasetostartthedatawarehousepilotproject.Thoroughpreparationswillensurethatthepilotprojectwillnotexceedathreetofourmonthtimeframe.Amongotherthings,theprojectteamwillhavetoclarifytechnicalissuesregardingthesystemandthecontractswiththesystempartnersselected,issuesmostlyarisingoutofthechosendatawarehousearchitecture.Theselectioncriteriaforthecentraldatabasecomputerwilldependupontheamountofdataanticipatednow(andinthepossiblyknownfuture),thenumberandtypesofusersandthecomplexityofthequeries.Fromtheuserspointofview,theselectionoftheanalysistoolisthemostcriticalissue;however,thankstostandardizedinterfaceslikeODBC,itisnotmandatorytostaywithachosentoolforever.Inthebeginning,itissufficienttohaveasuitableOLAPtoolformultidimensionalanalysisandsoftwareprogramsforaccessingthedatawarehousedatabasedirectly.
ProjectDesign
Duringthedesignphase,alltheinformationnecessaryforimplementingthedatawarehousemustbegathered,suchas:
Requirementsofthedepartmentsregardingthepotentialinformationuses;Descriptionofsourcedataused;Definitionofbusinessterms,datadefinitionsandtransformationrules;Datamodelsforthecentraldatawarehouseandthelocaldatamarts.
Simultaneously,thenecessaryhardwareandsoftwaremustbeinstalled.Basically,thedesignphasecanbebrokendownintofoursteps:
Businessquestionsfromdepartments.Inordertoincreasethepilotprojectschanceforsuccess,theselectedbusinessquestionsneedtobeofthegreatestpotentialusefulness.Businessquestionsdonotnecessarilyhavetobestatedasquestions.Existingreportsthatcontainkeyfiguresorconcretesuggestionsastoanalysesnotpossiblebeforecanalsobeused.
Datasourcesavailable.Aftertheusersrequestshavebeenroughlyanalyzed,theITdepartmentmustinvestigatethesourcesystemsandinterfacesavailablewithinthecompany.Duetothetimeconstraintsthepilotprojectfaces,onlydatacanbeconsideredthatisavailableandmeetscertainqualitystandards,suchascompletenessandcorrectcontents.Forasuccessfulpilotinstallation,itisimportanttofocusonthemostimportantrequirements.Therefore,thebusinessquestionsmustbecorrelatedtotheavailabledata.
Businessdatamodel.Thebusinessdatamodelreflectstherealobjectscustomer,order,product,etc.andtheirrelationshipstoeachother.Inordertorepresentthemcorrectlyinthebusinessdatamodel,businessruleshavetobeapplied,suchas"eachorderrelatestoonecustomer,"or"eachcustomercanbelongtovariouscategories."4
Logicaldatamodel.Thelogicaldatamodelinitsnormalizedformisbasedonthebusinessdatamodel,andallobjectsarerepresentedwiththeirattributes.Usually,notallattributesavailablefromthesourcesystemsareneededforansweringthebusinessquestionssubmitted.However,potentiallyusefuldataelementswillbeintegratedsoitwillnotlaterbenecessarytorepeatalltheanalysesperformedforthepilotproject.
Figure2:Allthedataisavailablethroughtheaccesslayer.
Forreasonsofperformanceorbecausethequerytoolrequiresit,denormalizeddatamodelsareneededalongsideanormalizeddatamodel.5Onepossibilityistocomplementthenormalizeddatamodelwithsummarytables.Inanotherapproach,so-called"starschema"or"starmodels"arecreatedinadditiontothenormalizeddatamodel(seeFigure2).Togetherwiththenormalizeddata,theyareavailabletousesthroughviewsonthedatabase.Eachtime,thedatawarehouseisaccessedthroughthesecuritylayer,inwhichalltheaccessauthorizationsarestores.Thenormalizedapproachprovidesamagnitudeofmuchgreatercapabilityandscalabilityinallowingforanyquestiontobeaskedofthedataandalsotoeasilyaddmoredatainthefuturetothedatawarehousedatabase.
Figure2representsthedatamartsaslogicalstructures,i.e.,thenumbersarerecalculatedeachtimetheyarecalledup.Whenstartingthepilotproject,thefirststepshouldbetocreatethedatamartslogically.Onlyifperformanceisreallylacking,willtheybephysicallyimplementedbyusingfacttables,sinceoptimizingperformanceisnothepilotprojectstoppriority.Iftheperformanceisacceptable,itissufficienttobeginanalyzingthedataselected.Morefine-tuningofboththedatabaseandthetoolsutilizedshouldbeaccomplishedAFTERtheusers/managershavebeguntoreporttheirexperiencesandnewvalue.
Anotewithregardtothesetwodifferentdatamodels:thenormalizeddatamodelrepresentsallthebusinessrelationshipsand,therefore,shouldnotbechangedwithoutverygoodreasonsfordoingso.Bycontrast,datamartsareforthemostpartbasedonstarschemamodelsandcontaindataforspecificsubjectareas.Ifanorganizationutilizedstarmodelsandtherearechangingbusinessrequirements,thedatamartshavetoberedesignedand/orre-adjustedtomeetanynewbusinessrequirements.Thiscanbeveryexpensiveandalsolimitthefuturescalabilityandgrowthofyourdatawarehouse.
CheckingtheResults
Thelaststepbeforeadoptingthelogicaldatamodelistocheckitbyusingselectedbusinessqueries.Atypicalbusinessquerymaybe:"Givemeallsalesintheaspecificmonth,brokendownintoindustries(i.e.,hotelsandrestaurantsonly);numberoftransactions;typesofcustomers;andmodeofpayment."
Usingthisquery,systemintegratoranduserscheckthemodeltablebytabletofindpossibleinterpretativeerrorsofthedatamodeler.Experienceshowsthatendusersareverywellabletounderstandalogicaldatamodel,eveniftheyhaveneverseenonebefore.Particularlyfordirectqueriestothedatabase,aprofoundunderstandingofthedatawarehousedatamodelisanecessity.
ImplementationoftheDesign
Afterfinishingthedesignphasewiththesystemcheckoutlined,thedesignwillbeimplementedonthetargetsystemwiththefirstuserscreatinganalysisandreports.Theimplementationphaseconsistsofallstepsnecessarytotransferdatafromtheoperationalsystemsintothedatawarehouse.
Corestepsforsuccessinusingthismethodare:
Thetransformationofthelogicaldatamodelintoaphysicaldatastructureonthetargetsystem;Thecreationofextractionandtransformationprograms;Theimplementationoftherequiredcontrolprocedurestoperiodicallyupdatethedatainthedatawarehouse;Usersdefiningandtestingtheirnewanalysisandreporting.
Apilotprojectforanenterprisedatawarehousewillusuallycontainonlyafewgigabytesofdata,involveoneortwodepartmentsandtwotofoursourcesystems.
Itmayrequiremorecoordinationbetweendepartments,ITmanagersandcompanyexecutivesthandoesaquicklyinstalled,isolateddatamart,buttheseeffortswillreallypayoffoncethepilotdatawarehouseisupandrunning.Yourcompanywillbeusingthisplatformbothforitscurrentinformationalneedsandwithaneyetothefutureasyourbusinessandyourrequirementsexpand.
既能够作为一个单独的应用程序应用在客户端服务器网络环境中,也能够作为一个库而嵌入到其他的软件中。 |
|