Comments on Cary Millsap: Dang it, people, they're syscalls, not "waits"...
Cary Millsap (http://www.blogger.com/profile/16697498718050285274)
Feed updated 2023-12-15T23:33:59.034-06:00

Comment posted 2020-01-20T16:10:43.103-06:00:
Kunwar,

If you strace an Oracle kernel process while it is waiting on a row lock, you’ll see a sequence of enqueue events in the Oracle trace file, and a sequence of associated operating system calls in the strace output. The OS call I’ve seen related to enqueue events is semtimedop. If you look at the manual page for semtimedop, I think what is going on will make sense to you.

Posted by Cary Millsap

Comment posted 2020-01-20T08:08:31.001-06:00:
Cary, one question.
Is "enq: TX - row lock contention" also not a wait but a syscall? I thought row lock contention was a pure database wait event, unlike "db file sequential read".

Rgds,
Kunwar

Posted by KunwarSingh

Comment posted 2015-02-04T14:00:52.513-06:00:
Karthik,

Thank you. Yes, that's exactly what I'm saying. In Oracle, response time is the sum of the time spent executing database calls, plus the time spent executing system calls (which Oracle people, unfortunately, call "waits"), with some double-counting correction for recursive calls. In this context, it’s fine to say that “R=S+W”, but it’s weird that we don’t say “R=D+S”, where D stands for “database calls” and S stands for “system calls.” I guess if we did that, people would incorrectly imagine that the “S” for syscalls stood for “service time.” :-)

The response time of a system call executed by an Oracle database kernel process is, in the queueing model R=S+Q, an R (a response time), not a Q (a queueing delay). A syscall is simply not a type of queueing delay for an Oracle kernel process.

Imagine, for example, a “db file sequential read” that maps to a pread OS call. Within that single pread call, there's a code path executed in OS kernel mode, which itself can be subject to queueing delay (if the CPU it needs is already busy), followed possibly by a queueing delay at the physical device before the actual seek and rotational latencies kick in (part of your S, the service time for the call), and this kind of behavior happens all the way through the data transfer until the call returns, at which point the Oracle kernel itself may have to queue for CPU in the OS “ready to run” state. The more detail you investigate, the more opportunities for correct application of R=S+Q you’ll find.

Queueing theory, specifically the M/M/m model, helps you predict Q (queueing delay) based on a system’s configuration and its load characteristics. However, if you want your Q to represent the duration of a syscall (e.g., of a “db file sequential read”), the formulas will not work for you. That's my main point.
It's because Oracle “waits” (unfortunate choice of terminology; actually they’re syscalls) are not queueing delays.

However, you might use M/M/m to predict the expected duration of a “db file sequential read” call if you know the architecture of the device servicing your reads, and if you know the distribution of read calls that are hitting that device.

Posted by Cary Millsap

Comment posted 2015-02-04T03:23:24.436-06:00:

Excellent article. Do you mean to say one should not confuse the classic queuing theory R=S+Q with the R=S+W used for Oracle response time analysis? Also, from which perspective (Oracle, OS, or application) does the classic queuing theory work well? I mean, where can we apply this equation directly?

Posted by Anonymous

Comment posted 2009-03-07T08:06:00.000-06:00:

Cary,

So until now we have had a very similar point of view on the Oracle DB. I have to agree with you on QT modeling too.
I have delivered two projects based on queuing theory: one, like yours, at a high level, and this one was the most accurate. The second one was calculated with a low-level model (using Oracle syscalls (waits), CPU measurements, etc.); the results were quite OK, but every independent change, even a small one, had a big bad impact on those results.
If you are interested in modeling, I can recommend this book:
R.
Jain, "The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling," April 1991, ISBN: 0471503361. It is an old one, but it describes a lot of modeling theory, and I have used it many times (even in my MA thesis about modeling ATM networks).

regards,
Marcin

Posted by Marcin Przepiorowski

Comment posted 2009-03-03T12:15:00.000-06:00:

"Every time control passes from one tier to a deeper one (Oracle to OS, OS to device), there's a new R being instantiated."

Exactly!
So Oracle's W is Linux's R, Linux's W is the hardware device's R, and so on.
Aggregating everything and arriving at the exact times spent in W vs S over all the layers seems nearly impossible, and as you wrote, it is usually not necessary.

Posted by Chen Shapira

Comment posted 2009-03-02T16:36:00.000-06:00:

Marcin,

In only one project have I used queueing theory to produce the principal deliverable for the project. It delivered results that were excellent approximations of reality. We modeled only at a very high level compared to this conversation, though. We modeled a cluster of Oracle databases as the service provider, and a Tuxedo-based application as the requester.

I do like to think about lower-level details in the context of queueing theory, because it helps me to understand (and explain, I hope) general tendencies more clearly.
...Like why it gives you better leverage to eliminate unnecessary work than to upgrade the capacity of your system.

However, trying to use queueing theory at too low a level of abstraction in our field is dangerous. Queueing systems in real life are very sensitive to small input changes. Queueing models accurately reflect that. So if you choose to model a low, very granular level of abstraction, then even tiny mistakes in how you compose the model or how you measure the model's input values can produce wildly inaccurate predictions.

One use that intrigues me is what Connie Smith and Lloyd Williams do in their book Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software (Addison-Wesley Object Technology Series) (http://www.amazon.com/gp/product/0201722291?ie=UTF8&tag=methodrcom-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0201722291).

I appreciate your comments, and I can assure you that your communications with me in English are clearer than would be any attempt I might make to communicate with you in your first language. :-)

Posted by Cary Millsap

Comment posted 2009-03-02T15:57:00.000-06:00:

Cary,

I once again have to agree with you.
If we are talking about the M/M/1 model (or, in general, M/M/n), Oracle "waits" (syscalls) are response times that include the service time (disk read time) and the queue time of the underlying OS service. That's of course true.
Using that model of the system, we can try to model our database using math or modeling tools.
BTW:
Did you ever model an Oracle DB using that approach?
But if we want to find out what happens inside our application, and if we assume that our database can be either in execute mode (using CPU and executing code) or in waiting mode (waiting for IO/memory/other resources), then calling a syscall or Oracle internal call a "wait" is not a very bad idea.

ps.
I hope I didn't mix up my answer to you. English is not my first language.

regards,
Marcin Przepiorowski

Posted by Marcin Przepiorowski

Comment posted 2009-03-02T12:43:00.000-06:00:

Chen, when I wrote that, I was imagining a syscall that has a sleep naturally in its code path, like the read() call you mentioned.

Imagine your read() call executing on a system with no competing workload and thus no queueing at any point in the call chain. The read() politely executes a sleep to free up your proc's CPU for someone else, because the author of read() knew that the hardware device service time is huge compared to the capacity of the CPU whence the call was made.

I think that the duration of the sleep executed in that context would be properly regarded as S time from the caller's (e.g., Oracle kernel's) R = S + W perspective. However, on a system in which there was queueing at the hardware device being read, the duration of that queueing is the duration that I would assign to the W slot.

The problem you've highlighted is that I accidentally mixed layers (in the sequence diagram sense) in my statement that you questioned. Sometimes, the entire duration of the sleep will be S, and sometimes it will include some non-zero W.
But the place to measure the S and W values for that part of your read call would be on the device to which control in the code path has been passed, which is the hardware device servicing the call. ...Not the device from which the call was made.

Every time control passes from one tier to a deeper one (Oracle to OS, OS to device), there's a new R being instantiated.

Posted by Cary Millsap

Comment posted 2009-03-02T12:00:00.000-06:00:

Cary,

Can you explain about sleeping being part of S and not W?

At least the Linux kernel will force processes to sleep while waiting for IO, which will count as W. When would sleep be counted as S?

Posted by Chen Shapira

Comment posted 2009-03-02T11:54:00.000-06:00:

Marcin, I agree that what you're talking about is a "wait," but only in the very loose definition #1 (from the original blog post) sense of the word. The problem with that perspective begins when someone thinks of W from the R = S + W formula when you say the word "wait."

From the database kernel's perspective the duration of a syscall is a response time. That is, it's an R, not a W. That response time consists of service (sometimes that service includes sleeping, which is also NOT a W) and, potentially, queueing (which is in fact waiting, by the stricter definition #2).
The "wait" events that Oracle reports on are just not W values at all.

Posted by Cary Millsap

Comment posted 2009-03-02T03:57:00.000-06:00:

Cary,
The naming convention depends on where you are sitting as an observer.
Of course, from the OS point of view, Oracle waits are not waits; there is work that has to be done (calling a syscall). From the Oracle point of view, waiting for a DB block is just a WAIT (even if during that time the server CPU or IO is not in an idle state).
I really appreciate your work and your book about Oracle waits.

regards,
Marcin Przepiorowski

Posted by Marcin Przepiorowski

Comment posted 2009-02-27T22:39:00.000-06:00:

Cary,

Excellent analysis on what these performance metrics break down into at the core level. Even 1 second can make a crucial difference in performance for mission-critical systems.

Posted by skymaster

Comment posted 2009-02-20T22:41:00.000-06:00:

Too bad the TPS Report is 40 pages long and has over 50,000 lines on it. But then again, that's probably another blog post :).

I think the presentation at RMOUG really helped people that didn't understand the difference between the different stages of "wait" in the environment. I know it helped me too.
Good stuff!

Posted by Dan Norris

Comment posted 2009-02-20T14:51:00.000-06:00:

Chen: Exactly. When ela=R for a db file sequential read, you can't know by looking at Oracle data how much of that R is S and how much is W. Your observation is spot on.

Posted by Cary Millsap

Comment posted 2009-02-20T12:14:00.000-06:00:

Good distinction. It seems like you have very little visibility into the real waits in the system.

Some of these syscalls are pure waiting time (latch-related events, for instance), but certainly not all of them.

If you see a 5-second "wait" on "db file sequential read", it could be 5 seconds spent reading LOTS of data, or 1 second spent reading and 4 seconds spent really waiting on the IO queue.

Posted by Chen Shapira
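
Editorial sketch (not part of the original thread): the split of a response time R into service time S and queueing delay W that the comments discuss can be illustrated with the textbook M/M/m formulas. This is only a rough illustration under the usual M/M/m assumptions (Poisson arrivals, exponentially distributed service times, m identical servers). The function names, the arrival rates, and the 10 ms service time are hypothetical values chosen for the example; they are not measurements from any Oracle system.

```python
from math import factorial


def erlang_c(m: int, rho: float) -> float:
    """Probability that an arriving request has to queue in an M/M/m system.

    m   -- number of identical parallel servers (hypothetical device count)
    rho -- utilization per server, which must satisfy 0 <= rho < 1
    """
    if not 0 <= rho < 1:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    a = m * rho  # offered load in Erlangs
    waiting_term = a**m / (factorial(m) * (1 - rho))
    no_wait_terms = sum(a**k / factorial(k) for k in range(m))
    return waiting_term / (no_wait_terms + waiting_term)


def response_time(m: int, arrival_rate: float, service_time: float):
    """Return (R, S, W) for an M/M/m queue, where R = S + W.

    arrival_rate -- requests per second arriving at the device
    service_time -- mean service time S per request, in seconds
    """
    rho = arrival_rate * service_time / m
    w = erlang_c(m, rho) * service_time / (m * (1 - rho))
    return service_time + w, service_time, w


if __name__ == "__main__":
    # Hypothetical single device with a 10 ms mean service time.
    for rate in (20, 50, 80):
        r, s, w = response_time(m=1, arrival_rate=rate, service_time=0.010)
        print(f"{rate:3d} req/s: R={r * 1000:.1f} ms  S={s * 1000:.1f} ms  W={w * 1000:.1f} ms")
```

With a single server at 80% utilization, the model puts W at about four times S, which is the same 1-to-4 split as in the "1 second spent reading and 4 seconds spent really waiting" example. The duration Oracle reports for the event corresponds to R in this model; the split into S and W has to be measured or modeled at the device, not read from the Oracle trace data.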