MSSQL教程之怎样创立一个乐成的数据堆栈(data ware...
用一个库#bak_database存放这些历史数据。创立|数据怎样创立一个乐成的数据堆栈(datawarehose),上面的故事将告知你!
Thecompanysfirstdatawarehouseprojectbeganwithacasualconversationbetweenseveralexecutivesontheirwaytolunch.ThepeopleinvolvedweretheITmanagerfordecisionsupportaswellasseveralmembersofadepartmentthathadjustdecidedtoinstalladatawarehouse.TheyhadplannedtoinstalltheirdatawarehousewithoutanyinvolvementfromtheITdepartment;nonetheless,thefollowingconversationensued:
"Weurgentlyneedadatawarehousetoanalyzeourdata!""Inthatcase,whydontyoutakeanOLAPtoolwithamultidimensionaldatabase?""Isitpossibletomakethesalesfiguresavailabletooursalespeople?""Yes,ofcourse,thatsnoproblembecauseofitsWebcapacity.""Weneedouranswersveryfast.""Performanceisntanissue,thedatarequestedcanbemadeavailableonalocalserver.""Great,whencanwestartouranalyses?""Installingsuchasystemshouldnttakemorethanafewweeks."
Encouragedbythesecasualtipsfromanexpert,thedepartmentdecidedtobuildadatawarehousethatcorrespondedexactlytoitsspecificneeds.Somemonthslater,thedatawarehousewasinstalledaccordingtotheoriginalspecifications.Afterthefirstsuccesseshadbecomepublicknowledgewithinthecompany,otherdepartmentsbegantoshowinterestinthedatawarehouse.Proudly,thesystemwasdisplayed,andenthusiasmwasspreading.Suddenly,eachdepartmentwantedtheirowndatawarehouse,andrequestsbegantopileuponthedesksoftheITmanagers.However,apartfromthecasualconversationpreviously,theITdepartmenthadnotbeeninvolvedinthedevelopmentofthisfirstdatawarehouse.Theprojectitselfhadbeenimplementedbythedepartmentwiththehelpofanexternalsystemintegrator.Nobodyhadplannedonintegratingadditionalusergroups.Ithadbecomeimperativetofurtherdevelopthissuccessfuldatawarehouse.Atthispointitbecameclearthatthedepartmenthadlockeditselfintoadatamartwithonlylimitedscalability,insteadofbuildingadatawarehousewithunlimitedcapacityforexpansion.Thisdifferencebetweendatamartsanddatawarehousesisabasicissuethatthewholecompanynowhadtoface.
WhattodoNext?
Basedonthesituationmentioned,somequestionsarise,suchas:Canadatawarehouseoriginallyconceivedonlyforonedepartmentbeusedforthewholecompany,orshouldnewdatamartsbebuiltforeachdepartment?Ifthelattersolutionispreferred,howwouldonedepartmentaccessdatafromanotherdepartment?Whowillguaranteethatalluserswillreceiveexactlythesameinformation?Thequestionaboutwhethertostartwithdatamartsoradatawarehousehasbeenwidelydiscussed.1,2,3Itcanonlybeansweredbyclearlydefiningtheprojectsgoals:doesithavetocovertheinformationneedsforcertaindepartmentsorisitseenasthefirststeptowardasharedenterpriseinformationpool.Ifonlydepartmentalneedsaretheissue,itwillsufficetoinstallsomeisolateddatamarts.However,ifacompanyregardsaccesstoanintegrated,company-widedatabaseascriticalforitsfuturesurvivalinthemarket,thenanenterprisedatawarehouseisthesolutiontoimplement.
Mythoughts,sofar,mayhavecreatedtheimpressionthatthereareonlytwooptions:eitherquicklyinstallafewdatamartstocoverafewdepartmentscurrentneedsforinformationorembarkontheexpensiveadventureofinstallinganenterprisedatawarehouse.Thereisathirdoptionthatcombinesthebestofbothworldsandcanbeimplementedquicklywithoutsacrificingfuturegrowthoptions.Thisthirdoptionistolaydownthefoundationofanenterprisedatawarehousebystartingwithascalabledatawarehouseframeworkinapilotproject.
HowtoProceed
Theproceduresthatleadtothisscalabledatawarehousepilotprojectarespecificallydesignedtosatisfytwoseeminglycontradictoryrequirements?fastdeliveryandexpandability.Assumingtheprojectiswellprepared,itshouldnottakemorethanthreeorfourmonthstoimplementafullyoperationalpilotforanenterprisedatawarehouse.Afteritisfinished,acompany-widedatawarehouseplatformwillbeavailableallowinguserstoexecuteconcreteanalysesanddevelopabetterunderstandingoftheirrealandsharedneeds.
Whatisthedifferencebetweenadatawarehousepilotandadepartmentaldatamart?Infact,thesetwoapproachesdiffermoreintheirstrategicgoalsthanintotalexpenditurerequiredforconceptualizationandimplementation.TorunaprojectwithinadepartmentmeansthatyoudonothavetonegotiatewithotherdepartmentsandITmanagers?somethingthatcanprovetime-consuming.Bycontrast,ifyouwanttoestablishacompany-wideproject,youmustcoordinatethiseffortwithotherdepartments,ITmanagementandtopexecutives.
Figure1:Theprocessofcreatingapilotforanenterprise.
Figure1showsapreparatoryphasetostartthedatawarehousepilotproject.Thoroughpreparationswillensurethatthepilotprojectwillnotexceedathreetofourmonthtimeframe.Amongotherthings,theprojectteamwillhavetoclarifytechnicalissuesregardingthesystemandthecontractswiththesystempartnersselected,issuesmostlyarisingoutofthechosendatawarehousearchitecture.Theselectioncriteriaforthecentraldatabasecomputerwilldependupontheamountofdataanticipatednow(andinthepossiblyknownfuture),thenumberandtypesofusersandthecomplexityofthequeries.Fromtheuserspointofview,theselectionoftheanalysistoolisthemostcriticalissue;however,thankstostandardizedinterfaceslikeODBC,itisnotmandatorytostaywithachosentoolforever.Inthebeginning,itissufficienttohaveasuitableOLAPtoolformultidimensionalanalysisandsoftwareprogramsforaccessingthedatawarehousedatabasedirectly.
ProjectDesign
Duringthedesignphase,alltheinformationnecessaryforimplementingthedatawarehousemustbegathered,suchas:
Requirementsofthedepartmentsregardingthepotentialinformationuses;Descriptionofsourcedataused;Definitionofbusinessterms,datadefinitionsandtransformationrules;Datamodelsforthecentraldatawarehouseandthelocaldatamarts.
Simultaneously,thenecessaryhardwareandsoftwaremustbeinstalled.Basically,thedesignphasecanbebrokendownintofoursteps:
Businessquestionsfromdepartments.Inordertoincreasethepilotprojectschanceforsuccess,theselectedbusinessquestionsneedtobeofthegreatestpotentialusefulness.Businessquestionsdonotnecessarilyhavetobestatedasquestions.Existingreportsthatcontainkeyfiguresorconcretesuggestionsastoanalysesnotpossiblebeforecanalsobeused.
Datasourcesavailable.Aftertheusersrequestshavebeenroughlyanalyzed,theITdepartmentmustinvestigatethesourcesystemsandinterfacesavailablewithinthecompany.Duetothetimeconstraintsthepilotprojectfaces,onlydatacanbeconsideredthatisavailableandmeetscertainqualitystandards,suchascompletenessandcorrectcontents.Forasuccessfulpilotinstallation,itisimportanttofocusonthemostimportantrequirements.Therefore,thebusinessquestionsmustbecorrelatedtotheavailabledata.
Businessdatamodel.Thebusinessdatamodelreflectstherealobjectscustomer,order,product,etc.andtheirrelationshipstoeachother.Inordertorepresentthemcorrectlyinthebusinessdatamodel,businessruleshavetobeapplied,suchas"eachorderrelatestoonecustomer,"or"eachcustomercanbelongtovariouscategories."4
Logicaldatamodel.Thelogicaldatamodelinitsnormalizedformisbasedonthebusinessdatamodel,andallobjectsarerepresentedwiththeirattributes.Usually,notallattributesavailablefromthesourcesystemsareneededforansweringthebusinessquestionssubmitted.However,potentiallyusefuldataelementswillbeintegratedsoitwillnotlaterbenecessarytorepeatalltheanalysesperformedforthepilotproject.
Figure2:Allthedataisavailablethroughtheaccesslayer.
Forreasonsofperformanceorbecausethequerytoolrequiresit,denormalizeddatamodelsareneededalongsideanormalizeddatamodel.5Onepossibilityistocomplementthenormalizeddatamodelwithsummarytables.Inanotherapproach,so-called"starschema"or"starmodels"arecreatedinadditiontothenormalizeddatamodel(seeFigure2).Togetherwiththenormalizeddata,theyareavailabletousesthroughviewsonthedatabase.Eachtime,thedatawarehouseisaccessedthroughthesecuritylayer,inwhichalltheaccessauthorizationsarestores.Thenormalizedapproachprovidesamagnitudeofmuchgreatercapabilityandscalabilityinallowingforanyquestiontobeaskedofthedataandalsotoeasilyaddmoredatainthefuturetothedatawarehousedatabase.
Figure2representsthedatamartsaslogicalstructures,i.e.,thenumbersarerecalculatedeachtimetheyarecalledup.Whenstartingthepilotproject,thefirststepshouldbetocreatethedatamartslogically.Onlyifperformanceisreallylacking,willtheybephysicallyimplementedbyusingfacttables,sinceoptimizingperformanceisnothepilotprojectstoppriority.Iftheperformanceisacceptable,itissufficienttobeginanalyzingthedataselected.Morefine-tuningofboththedatabaseandthetoolsutilizedshouldbeaccomplishedAFTERtheusers/managershavebeguntoreporttheirexperiencesandnewvalue.
Anotewithregardtothesetwodifferentdatamodels:thenormalizeddatamodelrepresentsallthebusinessrelationshipsand,therefore,shouldnotbechangedwithoutverygoodreasonsfordoingso.Bycontrast,datamartsareforthemostpartbasedonstarschemamodelsandcontaindataforspecificsubjectareas.Ifanorganizationutilizedstarmodelsandtherearechangingbusinessrequirements,thedatamartshavetoberedesignedand/orre-adjustedtomeetanynewbusinessrequirements.Thiscanbeveryexpensiveandalsolimitthefuturescalabilityandgrowthofyourdatawarehouse.
CheckingtheResults
Thelaststepbeforeadoptingthelogicaldatamodelistocheckitbyusingselectedbusinessqueries.Atypicalbusinessquerymaybe:"Givemeallsalesintheaspecificmonth,brokendownintoindustries(i.e.,hotelsandrestaurantsonly);numberoftransactions;typesofcustomers;andmodeofpayment."
Usingthisquery,systemintegratoranduserscheckthemodeltablebytabletofindpossibleinterpretativeerrorsofthedatamodeler.Experienceshowsthatendusersareverywellabletounderstandalogicaldatamodel,eveniftheyhaveneverseenonebefore.Particularlyfordirectqueriestothedatabase,aprofoundunderstandingofthedatawarehousedatamodelisanecessity.
ImplementationoftheDesign
Afterfinishingthedesignphasewiththesystemcheckoutlined,thedesignwillbeimplementedonthetargetsystemwiththefirstuserscreatinganalysisandreports.Theimplementationphaseconsistsofallstepsnecessarytotransferdatafromtheoperationalsystemsintothedatawarehouse.
Corestepsforsuccessinusingthismethodare:
Thetransformationofthelogicaldatamodelintoaphysicaldatastructureonthetargetsystem;Thecreationofextractionandtransformationprograms;Theimplementationoftherequiredcontrolprocedurestoperiodicallyupdatethedatainthedatawarehouse;Usersdefiningandtestingtheirnewanalysisandreporting.
Apilotprojectforanenterprisedatawarehousewillusuallycontainonlyafewgigabytesofdata,involveoneortwodepartmentsandtwotofoursourcesystems.
Itmayrequiremorecoordinationbetweendepartments,ITmanagersandcompanyexecutivesthandoesaquicklyinstalled,isolateddatamart,buttheseeffortswillreallypayoffoncethepilotdatawarehouseisupandrunning.Yourcompanywillbeusingthisplatformbothforitscurrentinformationalneedsandwithaneyetothefutureasyourbusinessandyourrequirementsexpand.
既能够作为一个单独的应用程序应用在客户端服务器网络环境中,也能够作为一个库而嵌入到其他的软件中。 备份方面可能还是一个老大难的问题。不能单独备份几个表总是感觉不爽。灵活备份的问题不知道什么时候才能解决。 无法深入到数据库系统层面去了解和探究 总感觉自己还是不会SQL 然后最好有实践机会,能够把实践到的和实践结合起来,其实理论思考是个非常困扰和痛苦的事情 比如日志传送、比如集群。。。 如果是将来做数据库的开发设计,就应该详细学习T-SQL的各种细节,包括T-SQL的程序设计、存储过程、触发器以及具体使用某个开发语言来访问数据库。
页:
[1]