仓酷云

Title: MSSQL web design: the ten most frequently asked questions about data stacks

Author: 萌萌妈妈    Time: 2015-1-16 22:24
The original intent of MySQL's developers was to use mSQL with their own fast low-level (ISAM) routines to connect to their tables. After some testing, they concluded that mSQL was neither as fast nor as flexible as they needed.

Although there are various approaches to data mining that seem to offer distinct features and benefits, many may not be powerful enough to meet your corporate knowledge discovery needs. In fact, just a few fundamental questions can quickly clarify the business benefits and the power of a data mining system, setting its advantages in a clear perspective. These questions need to be asked from the viewpoints of both business and technical users. Please note, however, that these questions refer to data mining; please also see the many benefits of the knowledge access paradigm, which uses the patterns discovered by data mining within a Pattern Warehouse™.

Here are two sets of "Top Ten Data Mining Questions" from business and technical perspectives. Each question has three parts that together highlight one specific aspect of a data mining system's power and capability.

The Top Ten Data Mining Business Questions

The top ten business questions should be asked by business users about the benefits, quality and usability of the system. They are:

Question 1: Business Benefits
a) How will this system help us?
b) How well does this system work for our industry-specific applications?
c) What information can we get that we do not already have?
It is essential to ask this question again and again. You should, of course, get new refined information, but it is not enough just to know something; you should have information that allows you to "act" within the context of your industry. You should also measure the bottom-line dollar benefits delivered by a data mining system. See the paper "Measuring the Dollar Value of Mined Information" for a framework for this.

Question 2: Technical Know-how
a) How technically sophisticated do we need to be to use it?
b) Can business users operate it without calling the IS group all the time?
c) Is it as easy to use as an internet browser?
Business users should be empowered with direct, on-demand access to refined knowledge. They should not have to know statistics, yet should be given consistent and correct answers. The system interface should be as easy to use as a web browser.

Question 3: Understandability and Explanations
a) Are the results intuitive or difficult to understand?
b) Do we get clear explanations for any information item presented?
c) Will the explanations be in technical statistical terms or in a form that we can understand?
Results should be presented to business users in plain English, accompanied by graphs. The system should be able to explain each piece of information it presents in clear, English-like terms that business users can easily comprehend and use.

Question 4: Follow-up Questions
a) What kinds of follow-up questions can we ask of the system?
b) Do we need to go to an analyst for further question answering?
c) How fast can we drill down on the fly to see more patterns?
Response to follow-up questions must be immediate. Business users should not need intermediaries such as analysts to get more information after they have seen some results. If follow-up questions take time and involve intermediaries, the business user's effectiveness will suffer. Business users should get refined information as they need it, when they need it.

Question 5: Business Users
a) How many business users can this system support?
b) Can the business users tailor their own questions for the system?
c) Can users utilize the knowledge for day-to-day decision making?
The system should be able to use the same fundamental knowledge to support a few hundred business users, each with a different group perspective. Yet all of these users must be given consistent answers as they ask their own questions. The information must be presented so that it can be used for day-to-day actions.

Question 6: Accuracy, Completeness and Consistency
a) How accurate are the results the system delivers?
b) Can some patterns be missed by the system?
c) Are the results always consistent or can 100 users get 100 different answers?
The system must cover a wide range of patterns and should provide high-quality information. The knowledge provided to business users should be derived from the entire data set (and not samples) in order to increase accuracy. All business users should access the same knowledge so that they all receive consistent answers, increasing the quality of corporate information.

Question 7: Incremental Analysis
a) Can we automatically analyze weekly/monthly data as it becomes available?
b) Can the system compare the "month to month" results and patterns by itself?
c) Can we get automatic pattern detection over time, every week or month?
The system should analyze data as it becomes available every week or month and perform on-going trend analysis, highlighting the key items and influence factors behind significant changes. The incremental analysis should be performed automatically in the background, informing the user of significant trends and their underlying causes.

Question 8: Data Handling
a) How much data can the system deal with?
b) Can it work directly on our database, or do we need to extract data?
c) If it works on extracts, how do we know that some patterns are not missed?
The system should handle moderate to large volumes of data on a powerful server; of course, large data volumes should not be expected to be managed on small servers. The system should work directly on the SQL database, without extracts, so that patterns are not missed and performance is improved.

Question 9: Integration
a) How will it integrate into our computing environment?
b) Will it just work on our existing SQL database?
c) How easily will the system work on our intranet?
The system should run smoothly on existing open server platforms (e.g. Unix) and popular DBMS engines (e.g. Oracle, Sybase, Informix, etc.) on the server. The system should present results to users on the corporate intranet. The absence of data conditioning requirements and extract files will make integration much easier.

Question 10: Support Staff
a) What staff do I need to keep this system installed and running?
b) How do we get support and training to get started?
c) What happens after we install the system?
After the initial system design, the support personnel for the system should be kept minimal. One database administrator should be able to manage the DBMS, and one analyst should occasionally help in setting up discovery models, etc. Thereafter, business users should be able to use the system on their own. There should be no need for a large number of resident support analysts to act as intermediaries for the business users.

The Top Ten Data Mining Technical Questions

The top ten technical questions should be asked by technical users about the architecture, power and scalability of the system. They are:

Question 1: Architecture
a) How are computations distributed between the client and the server?
b) Is any data brought from the server to the client?
c) Can the system run in a three-tiered architecture?
The best option is for the discovery to take place entirely on the server. Any attempt to bring data to the client will seriously limit the applicability of the system to larger databases. The best architecture is a thin-client, three-tiered system that uses the power of a large server-based SQL engine but operates on an intranet.

Question 2: Access to Real Data
a) Does the system work on the real SQL database or on samples and extracts?
b) If it samples or extracts, how do we know that it is accurate?
c) If it builds flat files, who manages this activity and cleans up for on-going analyses, and how can it sample across several tables?
The best option is for a data mining system to work on the real databases and not on samples, extracts and/or flat files. Working on the real database uses the SQL engine's power (e.g. parallel execution) and provides much more accurate results. The system should also be able to access database tables in their native form, reaching across tables by itself.

Question 3: Performance and Scalability
a) How large a database can the system analyze?
b) How long does it take to perform discovery on a large database?
c) Can the system run in parallel on a multi-processor server?
The system should work on databases with a large number of records. It should derive its capabilities from the power of the server and the SQL engine whenever possible. The system should be able to use the built-in parallelism of the SQL engine, but should also be able to use multiple processors for its own parallel non-SQL computations.

Question 4: Multi-Table Databases
a) Does the system work on a single table only or can it analyze multiple tables?
b) Does the system need to perform a huge join to access all of our tables?
c) If it works on a single table, how can we feed it our existing data schema?
The real world is full of multi-table databases which cannot be joined and meshed into a single view. In fact, the theory of normalization came about because data needs to be in more than one table. Using single tables is an affront to a decade of work on database design. If you challenge the DBA of a really large database to put things in a single table, you will get either a laugh or a blank stare; in many cases the database size will balloon beyond control. The system should be able to mine large multi-table databases directly by itself on the server.

Question 5: Multi-Dimensional Analysis
a) Does the system analyze data along a single dimension only?
b) How are multi-dimensional patterns discovered and expressed by the system?
c) How do we specify the dimensional structure of our data to the system?
The OLAP phenomenon has conclusively demonstrated that the business world's data is not single-dimensional. Hence a data mining system should be able to automatically discover patterns along multiple dimensions. In fact, there are many cases where no single-dimensional view can correctly represent the semantics of influence, because the influence ratios will always be off regardless of how one aggregates. See the paper "OLAP & Data Mining: Bridging the Gap" for a detailed discussion of this.

Question 6: Types and Classes of Patterns Discovered
a) How powerful and general are the patterns the system can discover and express?
b) Can the system mix different pattern types, e.g. influence and affinity patterns?
c) Can the system discover time-based patterns and trends?
The format of the patterns discovered by the system is very general and goes far beyond decision trees or simple affinities. The advantage is that the general rules discovered are far more powerful than decision trees. Decision trees are very limited in that they cannot find all the information in a database. Being rule-based keeps the system from being constrained to one part of a search space and makes sure that many more clusters and patterns are found, allowing the system to provide more information and better predictions.

Question 7: System Initiative
a) Does the system use its own initiative to perform discovery or is it guided by the user?
b) Can the system discover unexpected patterns by itself?
c) Can the system start up by itself on a weekly or monthly basis and perform discovery?
In some cases the user has to interact with and guide the system, e.g. to build a decision tree. However, a better approach is for the system to use its own initiative in the data mining process, forming hypotheses automatically based on the character of the data. The system should start up by itself, select the significant patterns in the data and filter out the unimportant trends. The analyses should be done routinely on a weekly or monthly basis.

Question 8: Treatment of Data Types
a) Are all data types handled in their own form or translated to other types?
b) Can the system find numeric ranges in data by itself?
c) Do a large number of non-numeric values cause problems for the system?
The system should manage all data types in a uniform manner and in their native formats, i.e. numbers, dates and constants should remain numbers, dates and constants internally. Interesting ranges in the data should be discovered by the system, not requiring "number bin" construction by the user. A large number of constant values in the database should not choke the system.

Question 9: Data Dependencies and Hierarchies
a) Can the system be told about the functional dependencies in our database?
b) Does the system understand the concept of data hierarchy?
c) How does the system use dependencies and/or hierarchies for discovery?
The system should be capable of using the functional (and other) dependencies that exist in a database. The use of these dependencies can significantly enhance the power of a discovery; in fact, ignoring them can lead to confusion. The system should understand the concept of hierarchy and should be able to use it for discovery along multiple dimensions.

Question 10: Flexibility and Noise Sensitivity
a) How brittle is the system when dealing with noisy data?
b) How well does the system cope with data exceptions and low quality data?
c) Can the system provide statements with flexible numeric ranges discovered by itself in the data?
The system should not be sensitive to noise and should internally use fuzzy logic to smooth data brittleness. As the data gathers noise, the system should only reduce the level of confidence associated with the results provided, not suddenly change direction in discovery. However, the system should still produce the most significant findings from the data set, even if noise is present.

An index is a special kind of file (an index on an InnoDB table is part of the tablespace) that holds reference pointers to all the records in a table. Indexes are not a cure-all: they speed up data retrieval but slow down data modification, because every time a record is modified the index has to be refreshed.
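The point about indexes at the end of the post can be seen in a small experiment. The sketch below is only an illustration: it uses SQLite through Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB, and the table name `orders` and index name `idx_orders_customer` are made up for the example. It asks the query planner how it would run the same lookup before and after the index exists.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL/InnoDB; the table and index names
# below are hypothetical and exist only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def query_plan(connection, sql):
    """Return the planner's description of how it would execute `sql`."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT amount FROM orders WHERE customer_id = 42"

# Without an index the planner has to scan the whole table.
plan_without_index = query_plan(conn, lookup)

# With an index on customer_id the planner can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, lookup)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text differs between SQLite versions, so only the presence of the index name is worth relying on. The cost side of the trade-off is not measured here: once the index exists, every INSERT, UPDATE or DELETE on `orders` also has to maintain it, which is the slowdown the post mentions.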
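Author: 仓酷云    Time: 2015-1-19 11:05
Very helpful for recursive tree traversal. Personally I think this is really great! Clear to read and very up to date.
Author: 再见西城    Time: 2015-1-24 12:47
For a database, queries are its soul, so how efficient are SQL queries really? The following discusses issues related to SQL queries, for your reference.
Author: 山那边是海    Time: 2015-2-1 13:11
Very helpful for recursive tree traversal. Personally I think this is really great! Clear to read and very up to date.
Author: 简单生活    Time: 2015-2-7 06:29
If you are learning SQL in order to build websites, it is not very hard! But doing database administration is harder!
Author: 爱飞    Time: 2015-2-20 20:56
The efficiency of partitioned tables is certainly something everyone cares about. In my tests, a query (filter) on the partitioning column ran faster than the same statement on an unpartitioned table. But a query on a non-partitioning column ran slower than the same statement on an unpartitioned table.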
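One plausible reason for the partitioned-table result above is partition pruning. The toy model below is not a real database engine, and every name in it (`PartitionedTable`, `month`, `region`) is hypothetical; it only counts how many partitions a filter has to visit when it is on the partitioning column versus any other column.

```python
from collections import defaultdict


# Toy model only: rows are plain dicts grouped by the value of one column.
# All names here are hypothetical and chosen for the illustration.
class PartitionedTable:
    def __init__(self, partition_key):
        self.partition_key = partition_key
        self.partitions = defaultdict(list)

    def insert(self, row):
        # Each row is stored in the partition matching its key value.
        self.partitions[row[self.partition_key]].append(row)

    def select(self, column, value):
        """Return (matching rows, number of partitions visited)."""
        if column == self.partition_key:
            # Pruning: only one partition can contain matching rows.
            targets = [self.partitions.get(value, [])]
        else:
            # No pruning possible: every partition has to be visited.
            targets = list(self.partitions.values())
        matches = [r for part in targets for r in part if r[column] == value]
        return matches, len(targets)


table = PartitionedTable("month")
for month in range(1, 13):
    for region in ("north", "south"):
        table.insert({"month": month, "region": region})

rows_by_month, visited_by_month = table.select("month", 3)
rows_by_region, visited_by_region = table.select("region", "north")

print(len(rows_by_month), visited_by_month)
print(len(rows_by_region), visited_by_region)
```

In this model the filter on `month` touches a single partition, while the filter on `region` has to open all twelve. In a real engine each extra partition carries some fixed overhead, which may be why the non-partition-column query ended up slower than on an unpartitioned table; actual behavior depends on the database and its indexes.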
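Author: 若相依    Time: 2015-3-6 19:10
This nicely strengthens Profiler's functionality. But speaking of Profiler, one reminder for everyone: on Windows 2003 you need to install the SP1 patch before Profiler will start. Otherwise clicking it does nothing.
Author: 灵魂腐蚀    Time: 2015-3-13 06:10
One approach is to write the SQL statements on the client side, where you can use DataSet to process the results;
Author: 冷月葬花魂    Time: 2015-3-20 14:41
Actually, you can draw an analogy: database products such as Oracle have long supported Java programming, and even provide a Java pool parameter as a user configuration interface. But which systems today make heavy use of Java stored procedures?! Even Oracle's own applications don't use them. Why?!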



