仓酷云


[Tutorial] MSSQL web design: the ten most frequently asked questions about data mining

萌萌妈妈 (this user has been deleted)
Original poster
Posted on 2015-1-16 22:24:51

The original intent of MySQL's developers was to use mSQL with their own fast low-level routines (ISAM) to connect to their tables. After some testing, they concluded that mSQL was not as fast or as flexible as they needed.

Although there are various approaches to data mining that seem to offer distinct features and benefits, many may not be powerful enough to meet your corporate knowledge discovery needs. But in fact just a few fundamental questions can quickly clarify the business benefits and the power of a data mining system, setting its advantages in a clear perspective. These questions need to be asked from the viewpoints of both business and technical users. However, please note that these questions refer to data mining; please also see the many benefits of the knowledge access paradigm, which uses the patterns discovered by data mining within a Pattern Warehouse™.

Here are two sets of "Top Ten Data Mining Questions" from business and technical perspectives. Each question has three parts that together highlight one specific aspect of a data mining system's power and capability.

The Top Ten Data Mining Business Questions

The top ten business questions should be asked by business users about the benefits, quality and usability of the system. They are:

Question 1: Business Benefits
a) How will this system help us?
b) How well does this system work for our industry-specific applications?
c) What information can we get that we do not already have?

It is essential to ask this question again and again. You should, of course, get new refined information, but it is not enough just to know something: you should have information that allows you to "act" within the context of your industry. And you should measure the bottom-line dollar benefits delivered by a data mining system. See the paper "Measuring the Dollar Value of Mined Information" for a framework for this.

Question 2: Technical Know-how
a) How technically sophisticated do we need to be to use it?
b) Can business users operate it without calling the IS group all the time?
c) Is it as easy to use as an internet browser?

Business users should be empowered with direct, on-demand access to refined knowledge. They should not have to know statistics, yet should be given consistent and correct answers. The system interface should be as easy to use as a web browser.

Question 3: Understandability and Explanations
a) Are the results intuitive or difficult to understand?
b) Do we get clear explanations for any information item presented?
c) Will the explanations be in technical statistical terms or in a form that we can understand?

Results should be presented to business users in plain English, accompanied by graphs. The system should be able to explain each piece of information it presents in clear, English-like terms that business users can easily comprehend and use.

Question 4: Follow-up Questions
a) What kinds of follow-up questions can we ask of the system?
b) Do we need to go to an analyst for further question answering?
c) How fast can we drill down on the fly to see more patterns?

Response to follow-up questions must be immediate. Business users should not need to use intermediaries such as analysts to get more information after they have seen some results. If follow-up questions take time and involve intermediaries, the business users' effectiveness will be impacted. Business users should get refined information as they need it, when they need it.

Question 5: Business Users
a) How many business users can this system support?
b) Can the business users tailor their own questions for the system?
c) Can users utilize the knowledge for day-to-day decision making?

The system should be able to use the same fundamental knowledge to support a few hundred business users, each with a different group perspective. Yet all of these users must be given consistent answers as they ask their own questions. The information must be presented such that it can be utilized for day-to-day actions.

Question 6: Accuracy, Completeness and Consistency
a) How accurate are the results the system delivers?
b) Can some patterns be missed by the system?
c) Are the results always consistent or can 100 users get 100 different answers?

The system must cover a wide range of patterns and should provide high-quality information. The knowledge provided to business users should be derived from the entire data set (and not samples) in order to increase accuracy. All business users should access the same knowledge so that they all receive consistent answers, increasing the quality of corporate information.

Question 7: Incremental Analysis
a) Can we automatically analyze weekly/monthly data as it becomes available?
b) Can the system compare the "month to month" results and patterns by itself?
c) Can we get automatic pattern detection over time, every week or month?

The system should analyze data as it becomes available every week or month and perform on-going trend analysis, highlighting the key items and influence factors that impact significant changes. The incremental analysis should be performed automatically in the background, informing the user of significant trends and the underlying causes.

Question 8: Data Handling
a) How much data can the system deal with?
b) Can it work directly on our database, or do we need to extract data?
c) If it works on extracts, how do we know that some patterns are not missed?

The system should handle moderate to large volumes of data on a powerful server; of course, large data volumes should not be expected to be managed on small servers. The system should work directly on the SQL database, without extracts, so that patterns are not missed and performance is improved.

Question 9: Integration
a) How will it integrate into our computing environment?
b) Will it just work on our existing SQL database?
c) How easily will the system work on our intranet?

The system should run smoothly on existing open server platforms (e.g. Unix) and popular DBMS engines (e.g. Oracle, Sybase, Informix, etc.) on the server. The system should present results to users on the corporate intranet. The absence of data conditioning requirements and extract files will make integration much easier.

Question 10: Support Staff
a) What staff do I need to keep this system installed and running?
b) How do we get support and training to get started?
c) What happens after we install the system?

After the initial system design, the support personnel for the system should be kept minimal. One database administrator should be able to manage the DBMS, and one analyst should occasionally help in setting up discovery models, etc. Thereafter, business users should be able to use the system on their own. There should be no need for a large number of resident support analysts to act as intermediaries for the business users.

The Top Ten Data Mining Technical Questions

The top ten technical questions should be asked by technical users about the architecture, power and scalability of the system. They are:

Question 1: Architecture
a) How are computations distributed between the client and the server?
b) Is any data brought from the server to the client?
c) Can the system run in a three-tiered architecture?

The best option is for the discovery to take place entirely on the server. Any attempt to bring data to the client will seriously limit the applicability of the system to larger databases. The best architecture is a thin-client, three-tiered system that uses the power of a large server-based SQL engine but operates on an intranet.

Question 2: Access to Real Data
a) Does the system work on the real SQL database or on samples and extracts?
b) If it samples or extracts, how do we know that it is accurate?
c) If it builds flat files, who manages this activity and cleans up for on-going analyses, and how can it sample across several tables?

The best option is for a data mining system to work on the real databases and not on samples, extracts and/or flat files. Working on the real database uses the SQL engine's power (e.g. parallel execution) and provides much more accurate results. And the system should be able to access database tables in their native form, reaching across tables by itself.

Question 3: Performance and Scalability
a) How large a database can the system analyze?
b) How long does it take to perform discovery on a large database?
c) Can the system run in parallel on a multi-processor server?

The system should work on databases with a large number of records. It should derive its capabilities from the power of the server and the SQL engine, whenever possible. The system should be able to use the built-in parallelism of the SQL engine, but should also be able to use multiple processors for its own parallel non-SQL computations.

Question 4: Multi-Table Databases
a) Does the system work on a single table only or can it analyze multiple tables?
b) Does the system need to perform a huge join to access all of our tables?
c) If it works on a single table, how can we feed it our existing data schema?

The real world is full of multi-table databases which cannot be joined and meshed into a single view. In fact, the theory of normalization came about because data needs to be in more than one table. Using single tables is an affront to a decade of work on database design. If you challenge the DBA of a really large database to put things in a single table you will either get a laugh or a blank stare; in many cases the database size will balloon beyond control. The system should be able to mine large multi-table databases directly by itself on the server.

Question 5: Multi-Dimensional Analysis
a) Does the system analyze data along a single dimension only?
b) How are multi-dimensional patterns discovered and expressed by the system?
c) How do we specify the dimensional structure of our data to the system?

The OLAP phenomenon has conclusively demonstrated that the business world's data is not single-dimensional. Hence a data mining system should be able to automatically discover patterns along multiple dimensions. In fact, there are many cases where no single-dimensional view can correctly represent the semantics of influence because the influence ratios will always be off regardless of how one aggregates. See the paper "OLAP & Data Mining: Bridging the Gap" for a detailed discussion of this.

Question 6: Types and Classes of Patterns Discovered
a) How powerful and general are the patterns the system can discover and express?
b) Can the system mix different pattern types, e.g. influence and affinity patterns?
c) Can the system discover time-based patterns and trends?

The format of the patterns discovered by the system is very general and goes far beyond decision trees or simple affinities. The advantage of this is that the general rules discovered are far more powerful than decision trees. Decision trees are very limited in that they cannot find all the information in a database. Being rule-based keeps the system from being constrained to one part of a search space and makes sure that many more clusters and patterns are found, allowing the system to provide more information and better predictions.

Question 7: System Initiative
a) Does the system use its own initiative to perform discovery or is it guided by the user?
b) Can the system discover unexpected patterns by itself?
c) Can the system start up by itself on a weekly or monthly basis and perform discovery?

In some cases the user has to interact with and guide the system, e.g. to build a decision tree. However, a better approach is for the system to use its own initiative in the data mining process, forming hypotheses automatically based on the character of the data. The system should start up by itself, select the significant patterns in the data and filter out the unimportant trends. The analyses should be done routinely on a weekly or monthly basis.

Question 8: Treatment of Data Types
a) Are all data types handled in their own form or translated to other types?
b) Can the system find numeric ranges in data by itself?
c) Do a large number of non-numeric values cause problems for the system?

The system should manage all data types in a uniform manner and in their native formats, i.e. numbers, dates and constants should remain numbers, dates and constants internally. Interesting ranges in the data should be discovered by the system, not requiring "number bin" construction by the user. A large number of constant values in the database should not choke the system.

Question 9: Data Dependencies and Hierarchies
a) Can the system be told about the functional dependencies in our database?
b) Does the system understand the concept of data hierarchy?
c) How does the system use dependencies and/or hierarchies for discovery?

The system should be capable of using the functional (and other) dependencies that exist in a database. The use of these dependencies can significantly enhance the power of a discovery; in fact, ignoring them can lead to confusion. The system should understand the concept of hierarchy and should be able to use it for discovery along multiple dimensions.

Question 10: Flexibility and Noise Sensitivity
a) How brittle is the system when dealing with noisy data?
b) How well does the system cope with data exceptions and low quality data?
c) Can the system provide statements with flexible numeric ranges discovered by itself in the data?

The system should not be sensitive to noise and should internally use fuzzy logic to smooth data brittleness. As the data gathers noise, the system should only reduce the level of confidence associated with the results provided, not suddenly change direction in discovery. However, the system should still produce the most significant findings from the data set, even if noise is present.

An index is a special kind of file (an index on an InnoDB table is part of the tablespace) that contains reference pointers to all the records in the table. Indexes are not a cure-all: they speed up data retrieval but slow down data modification, because every time a record is modified the index must be refreshed.
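The retrieval side of the index trade-off mentioned at the end of the post can be seen in a small sketch. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption; the post itself mentions InnoDB and the thread is filed under MSSQL), and the `orders` table, its columns and the index name are hypothetical. It only compares the query plan before and after creating an index; it does not measure the extra write cost.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable plan lines SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the 4th column holds the plan detail text

# Hypothetical demo table: 1000 orders spread over 100 customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"cust{i % 100}", float(i)) for i in range(1000)],
)

lookup = "SELECT amount FROM orders WHERE customer = ?"

# Without an index the engine has to scan the whole table.
before = query_plan(conn, lookup, ("cust7",))

# With an index it can search the index instead. From now on, every
# INSERT/UPDATE/DELETE on orders must also maintain this index.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = query_plan(conn, lookup, ("cust7",))

print("before:", before)
print("after: ", after)
```

The exact wording of the plan lines differs between SQLite versions, so the sketch prints them rather than assuming a fixed string; the point is only that the first plan is a table scan and the second names the index.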
2#
Posted on 2015-1-19 11:05:40
Very helpful for recursive tree traversal. Personally I think this is great! Clear to read and very up to date.
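再见西城 (this user has been deleted)
3#
Posted on 2015-1-24 12:47:51
For a database, queries are its soul, so how efficient are SQL queries really? The following discusses issues related to SQL queries, for your reference.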
山那边是海 (this user has been deleted)
4#
Posted on 2015-2-1 13:11:40
Very helpful for recursive tree traversal. Personally I think this is great! Clear to read and very up to date.
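简单生活 (this user has been deleted)
5#
Posted on 2015-2-7 06:29:46
If you are learning SQL in order to build websites, it is not very hard! But doing database administration is more difficult!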
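爱飞 (this user has been deleted)
6#
Posted on 2015-2-20 20:56:30
Partitioned-table efficiency is surely something everyone cares about. In my experiments, a query that filters on the partitioning column runs faster than the same statement against a non-partitioned table. But a query on a non-partitioning column runs slower than the same statement against a non-partitioned table.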
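One way to picture why that might happen is the toy model below. It is only a sketch under stated assumptions, not how any real engine implements partitioning: `PartitionedTable` is a hypothetical class that keeps one partition per distinct key value, and it counts how many partitions a query has to visit. A filter on the partitioning column can skip every other partition, while a filter on any other column has to visit all of them, which in a real engine adds per-partition overhead.

```python
from collections import defaultdict

class PartitionedTable:
    """Toy table with one partition per distinct value of the key column."""

    def __init__(self, key):
        self.key = key
        self.partitions = defaultdict(list)

    def insert(self, row):
        # Route each row (a dict) to the partition for its key value.
        self.partitions[row[self.key]].append(row)

    def select(self, column, value):
        """Return (matching rows, number of partitions visited)."""
        if column == self.key:
            # Pruning: only one partition can contain matching rows.
            parts = [self.partitions.get(value, [])]
        else:
            # No pruning possible: every partition has to be checked.
            parts = list(self.partitions.values())
        rows = [r for part in parts for r in part if r[column] == value]
        return rows, len(parts)

# Hypothetical data: 12 monthly partitions, 10 rows each.
table = PartitionedTable("month")
regions = ["north", "south", "east"]
for month in range(1, 13):
    for i in range(10):
        table.insert({"month": month, "region": regions[i % 3], "amount": i})

_, visited_key = table.select("month", 3)
_, visited_other = table.select("region", "north")
print("partitions visited, filter on partition key:", visited_key)
print("partitions visited, filter on other column:", visited_other)
```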
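若相依 (this user has been deleted)
7#
Posted on 2015-3-6 19:10:29
This greatly strengthens Profiler's capabilities. But speaking of Profiler, one reminder: on Windows 2003 you must install the SP1 patch before Profiler will start. Otherwise clicking it does nothing.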
灵魂腐蚀 (this user has been deleted)
8#
Posted on 2015-3-13 06:10:25
One approach is to write the SQL statements on the client, where you can use a DataSet to process the results;
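冷月葬花魂 (this user has been deleted)
9#
Posted on 2015-3-20 14:41:30
Consider an analogy: database products such as Oracle have long supported Java programming, and even provide a Java pool parameter as a user configuration interface. But which systems today make heavy use of Java stored procedures?! Even Oracle's own applications do not use them. Why?!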