|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
tips for db programmingHi guys,
do you have any tips to share about the following works/environment I'm involving right now? 1. performance tuning 2. high volume transaction processing 3. write & lock tuning 4. large databases This is really a good start.
http://www.sql-server-performance.com/ AMB Show quote "Britney" wrote: > Hi guys, > > do you have any tips to share about the following works/environment I'm > involving right now? > > 1. performance tuning > > 2. high volume transaction processing > > 3. write & lock tuning > > 4. large databases > > > > > > The biggest bottleneck in any system is the disk subsystem. Make it as fast
as possible. Use a separate disk array for tempdb. Use a separate disk array for transaction logs. Don't use RAID-5, use RAID 0+1. RAID-5 requires a read of all disks and a write to two disks for every write. I've achieved higher performance in disk subsystems by using only part of each disk. If the full-stroke seek time of a disk is 11ms, and the track-track seek time is 1ms, then the average seek time is 6ms. If you limit the system to only use a fraction of the disk, you can significantly increase performance. If your partition is only 1/2 of the available space for a disk, you can reduce the average seek time from 6ms down to 3.5ms. If you only use 1/4 of the available space, you can reduce the average seek time from 6ms down to 2.25ms. Further spreading the data across four disks, and the average seek time for the entire array is reduced to 0.5625ms. Doubling that (RAID 0+1) for reads, gives you 0.28125ms per seek. Use high-end disks 15K RPM, since that reduces latency. Remember, the total time required to read information from a disk = SEEK + LATENCY + TRANSFER. Some RAID controllers read an entire track at a time, so a higher RPM means less time required to do that. Examine each table and how it's used. Separate the most volitile onto their own disk subsystem. (You may want to put them into their own database, and separate the database and log files onto their own separate subsystems.) Use the lowest transaction isolation level needed to maintain integrity. Make sure transactions are as short as possible. Use optimistic concurrency--in other words, read the information needed to process a transaction WITH(NOLOCK), perform any calculations and transformations--caching the results, lock the affected rows--checking to see if any rows had been changed by another process, and finally write out the results. If any rows were changed by another process (a collision), release the locks (rollback the transaction) and restart the process. The frequency of collisions is usually low, so the additional overhead caused by collisions does not usually impact performance significantly. Show quote "Britney" <britneychen_2***@yahoo.com> wrote in message news:OWyWDUOfFHA.1148@TK2MSFTNGP12.phx.gbl... > Hi guys, > > do you have any tips to share about the following works/environment I'm > involving right now? > > 1. performance tuning > > 2. high volume transaction processing > > 3. write & lock tuning > > 4. large databases > > > > > Hi Britney,
I agree with the other posts here, and would add that Not all performance strategies perform well or uniformly because there's an inherent hierarchy or dependency between performance strategies that's often overlooked. 5. Server tuning, adding memory, and improving disk i/o all make a significant difference. In fact, many managers respond to poor performance by blindly throwing money at hardware. However, running slow software faster is still just running slow software and even the best hardware will only "wait faster" for a blocked resource. 4. Locking and blocking is more common a problem than most developers think, and it's an issue that needs to be addressed and solved. But reducing blocked resources won't overcome an unnecessary table scan. 3. Indexing is the performance bridge between queries and data and a key performance strategy. An index can make the difference between an instant response vs. a hour wait. But well-designed indexes can't overcome non-scalable iterative code. 2. Set-based solution. SQL is a set-based system and iterative row-by-row operations actually function as zillions of small single row sets. Whether the iterations take place as server-side SQL cursors, or ADO loops through a record set, iterative code is costly. My number two performance strategy is to use set-based solutions. But good set-based code can't overcome a clumsy or overly complex schema design. 1. Schema Design is my number one performance strategy. It's far easier to write clean set-based queries, tune the indexes, reduce locking, and fine-tune a database with the simpler, generalized, and normalized data schema than it is to optimize a database with an overly complex schema. I believe the purist logical data modeler is the number one performance problem in our industry because of the cascading problems caused by his burdensome designs. Take the time to be absolutely sure you have the simplest possible schema that meets the requirements. Remember that normalization is only a tool, not the whole point of the database. So build a normalized schema but don't stop at normalization as if it's a waterfall, one-right-answer, type of technique. It's not. Normalization is like the grammar rules of writing. Use them but be creative until the schema is easy, obvious, and gets the job done using the fewest possible number of tables that meets the requirements. I'm NOT advocating denormlization. Depending on how you view the entities, or generalize entities, there are often numerous ways to design a schema. Your job is to find those ways and compare them for query path, flexibility , and understandability. Often I see entities that I would have generalized together being broken apart into multiple sets of tables because they are not "exactly" the same thing. This can inflate a schema that could have been 20 tables into a monster with 150 tables very easily. If you don't belabor the schema design you'll end up with a schema that makes the code do extra work, is complex to query, introduces data integrity errors, or even encourages the use of cursors to work with the data. btw, I'm leaving for vacation for a week, be back later. Show quote "Britney" <britneychen_2***@yahoo.com> wrote in message news:OWyWDUOfFHA.1148@TK2MSFTNGP12.phx.gbl... > Hi guys, > > do you have any tips to share about the following works/environment I'm > involving right now? > > 1. performance tuning > > 2. high volume transaction processing > > 3. write & lock tuning > > 4. large databases > > > > > |
|||||||||||||||||||||||