Index: ossp-pkg/sio/BRAINSTORM/Apache-Dean-Thoughts.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/Apache-Dean-Thoughts.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/Apache-Dean-Thoughts.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/Apache-Dean-Thoughts.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/Apache-Dean-Thoughts.txt +++ - 2024-05-20 02:18:02.320841416 +0200 @@ -0,0 +1,114 @@ +From dgaudet@arctic.org Mon Jun 28 19:06:50 1999 +Path: engelschall.com!mail2news!apache.org!new-httpd-owner-rse+apache=en.muc.de +From: dgaudet@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: async routines +Date: 28 Jun 1999 17:33:24 +0200 +Organization: Mail2News at engelschall.com +Lines: 96 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 930584004 99816 141.1.129.1 (28 Jun 1999 15:33:24 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 28 Jun 1999 15:33:24 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:31280 + +[hope you don't mind me cc'ing new-httpd zach, I think others will be +interested.] + +On Mon, 28 Jun 1999, Zach Brown wrote: + +> so dean, I was wading through the mpm code to see if I could munge the +> sigwait stuff into it. +> +> as far as I could tell, the http protocol routines are still blocking. +> what does the future hold in the way for async routines? :) I basically +> need a way to do something like.. + +You're still waiting for me to get the async stuff in there... I've done +part of the work -- the BUFF layer now supports non-blocking sockets. + +However, the HTTP code will always remain blocking. There's no way I'm +going to try to educate the world in how to write async code... and since +our HTTP code has arbitrary call outs to third party modules... It'd +have a drastic effect on everyone to make this change. 
+ +But I honestly don't think this is a problem. Here's my observations: + +All the popular HTTP clients send their requests in one packet (or two +in the case of a POST and netscape). So the HTTP code would almost +never have to block while processing the request. It may block while +processing a POST -- something which someone else can worry about later, +my code won't be any worse than what we already have in apache. So +any effort we put into making the HTTP parsing code async-safe would +be wasted on the 99.9% case. + +Most responses fit in the socket's send buffer, and again don't require +async support. But we currently do the lingering_close() routine which +could easily use async support. Large responses also could use async +support. + +The goal of HTTP parsing is to figure out which response object to +send. In most cases we can reduce that to a bunch of common response +types: + +- copying a file to the socket +- copying a pipe/socket to the socket (IPC, CGIs) +- copying a mem region to the socket (mmap, some dynamic responses) + +So what we do is we modify the response handlers only. We teach them +about how to send async responses. + +There will be a few new primitives which will tell the core "the response +fits one of these categories, please handle it". The core will do the +rest -- and for MPMs which support async handling, the core will return +to the MPM and let the MPM do the work async... the MPM will call a +completion function supplied by the core. (Note that this will simplify +things for lots of folks... for example, it'll let us move range request +handling to a common spot so that more than just default_handler +can support it.) + +I expect this to be a simple message passing protocol (pass by reference). +Well rather, that's how I expect to implement it in ASH -- where I'll +have a single thread per-process doing the select/poll stuff; and the +other threads are in a pool that handles the protocol stuff. 
For your
+stuff you may want to do it another way -- but we'll be using a common
+structure that the core knows about... and that structure will look like
+a message:
+
+    struct msg {
+        enum {
+            MSG_SEND_FILE,
+            MSG_SEND_PIPE,
+            MSG_SEND_MEM,
+            MSG_LINGERING_CLOSE,
+            MSG_WAIT_FOR_READ,  /* for handling keep-alives */
+            ...
+        } type;
+        BUFF *client;
+        void (*completion)(struct msg *, int status);
+        union {
+            ... extra data here for whichever types need it ...;
+        } x;
+    };
+
+The nice thing about this is that these operations are protocol
+independent... at this level there's no knowledge of HTTP, so the same
+MPM core could be used to implement other protocols.
+
+> so as I was thinking about this stuff, I realized it might be neat to have
+> 'classes' of non blocking pending work and have different threads with
+> different priorities hacking on it.  Say we have a very high priority
+> thread that accepts connections, does initial header parsing, and
+> sendfile()ing data out.  We could have lower priority threads that are
+> spinning doing 'harder' BUFF work like an encryption layer or gzipping
+> content, whatever.
+
+You should be able to implement this in your MPM easily I think... because
+you'll see the different message types and can distribute them as needed.
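A minimal sketch of how an MPM might consume such a message and report back through the completion callback. This is an illustration only, not code from Apache: BUFF is left opaque, the union payload (`mem`), `mpm_dispatch`, and `record_completion` are hypothetical names, and the "I/O" is reduced to setting a status so the control flow stays visible.

```c
#include <assert.h>
#include <stddef.h>

typedef struct buff BUFF;   /* opaque here; the real BUFF lives in buff.c */

enum msg_type {
    MSG_SEND_FILE,
    MSG_SEND_PIPE,
    MSG_SEND_MEM,
    MSG_LINGERING_CLOSE,
    MSG_WAIT_FOR_READ       /* for handling keep-alives */
};

struct msg {
    enum msg_type type;
    BUFF *client;
    void (*completion)(struct msg *, int status);
    union {
        /* hypothetical payload for MSG_SEND_MEM */
        struct { const char *base; size_t len; } mem;
    } x;
};

/* Hypothetical MPM entry point: handle one message, then hand the result
 * back to the core via the completion callback.  Returns 0 when the
 * message type is recognised, -1 otherwise. */
static int mpm_dispatch(struct msg *m)
{
    int status;

    assert(m != NULL && m->completion != NULL);
    switch (m->type) {
    case MSG_SEND_FILE:
    case MSG_SEND_PIPE:
    case MSG_SEND_MEM:
    case MSG_LINGERING_CLOSE:
    case MSG_WAIT_FOR_READ:
        status = 0;         /* a real MPM would start the I/O here */
        break;
    default:
        status = -1;        /* unknown message type */
        break;
    }
    m->completion(m, status);
    return status;
}

/* Completion callback used for illustration: remembers what it was told. */
static int last_status = 99;
static enum msg_type last_type;

static void record_completion(struct msg *m, int status)
{
    last_status = status;
    last_type = m->type;
}
```

Because the dispatcher only looks at `type`, an MPM could just as well route each message class to a different thread pool before calling the completion function.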
+ +Dean + Index: ossp-pkg/sio/BRAINSTORM/CS95-441.ps.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/CS95-441.ps.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/CS95-441.ps.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/CS95-441.ps.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/CS95-441.ps.L +++ - 2024-05-20 02:18:02.323525992 +0200 @@ -0,0 +1 @@ +http://www.cs.ucsd.edu/groups/csl/pubs/conf/sosp95.html Index: ossp-pkg/sio/BRAINSTORM/CS95-441.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/CS95-441.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/CS95-441.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/CS95-441.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/CS95-441.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/IO-Lite-TR97-269.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite-TR97-269.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite-TR97-269.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/IO-Lite-TR97-269.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/IO-Lite-TR97-269.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/IO-Lite-iol98.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite-iol98.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite-iol98.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/IO-Lite-iol98.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/IO-Lite-iol98.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/IO-Lite-presentation.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite-presentation.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite-presentation.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/IO-Lite-presentation.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/IO-Lite-presentation.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/IO-Lite.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite.txt,v co -q -kk -p'1.1' 
'/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO-Lite.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/IO-Lite.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/IO-Lite.txt +++ - 2024-05-20 02:18:02.336766233 +0200 @@ -0,0 +1,74 @@ +From jg@pa.dec.com Wed Mar 3 08:37:08 1999 +Path: engelschall.com!mail2news!apache.org!new-httpd-owner-rse+apache=en.muc.de +From: jg@pa.dec.com (Jim Gettys) +Newsgroups: en.lists.apache-new-httpd +Subject: OSDI paper - IO-Lite: A Unified I/O Buffering and Caching System +Date: 3 Mar 1999 07:14:53 +0100 +Organization: Mail2News at engelschall.com +Lines: 56 +Approved: postmaster@m2ndom +Message-ID: <9903021931.AA13619@pachyderm.pa.dec.com> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 920441693 44447 141.1.129.1 (3 Mar 1999 06:14:53 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 3 Mar 1999 06:14:53 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:29016 + +I am doing something that I seldom do: cross posting between two high volume +mailing lists. (linux-kernel and the Apache developer's lists). Sometimes +it is useful for us folks who build applications to attend base operating +system conferences, which I had not for a while. + +I encourage everyone to read the paper: + +IO-Lite: A Unified I/O Buffering and Caching System, by Vivek Pai, Peter +Druschel, and Willy Zwaenepoel, published in the 3rd Symposium on Operating +Systems Design and Implementation (OSDI '99) Proceedings, New Orleans, +Louisiana, February 22-25, 1999, pp15-28. + +http://www.cs.rice.edu/~vivek/iol98/ (OSDI paper) +http://www.cs.rice.edu/~vivek/vivekmsee/ (MS thesis) + +At the latest (3rd) Symposium on Operating System Design and Implementation, +the "best of conference" award went to this paper, which reports on both +the design and implementation of a unified IO scheme. I think the award was +well placed. 
+ +It shows a new I/O approach that is very general and flexible, and avoids +data copies with minimal overhead, even between processes. The authors +use as an example Web service, and show very good performance gains. + +While I believe the paper overstates the benefits for "vanilla" web service, +for CGI it should clearly be a major win. IO-Lite avoids the redundant +data copies that normally occur in the standard UNIX read/write semantics, +and copies in the network layer. + +My intution tells me that the interfaces proposed here should be very +useful for a very wide range of applications (including a certain window +system I'm a bit fond of, which already took advantage of writev; the +IO-Lite interfaces look better to me). + +For full benefits to be reaped, an application can use new system call +interfaces that IO-Lite introduces. The immediate application that comes +to my mind is Apache (ergo the cross posting), particularly as Apache +thinks through its V2 design. I hope that we can have a productive +discussion and this thread may be able to provide clarification (I cc'ed +the authors of the paper), and encourage IO-Lite's adoption. + +This has been implemented in FreeBSD... And, of course, it would be nice +to have it in Linux as well, and for Apache to be able to take full +advantage of IO-Lite. + + - Jim Gettys + + +-- +Jim Gettys +Industry Standards and Consortia +Compaq Computer Corporation +Visting Scientist, World Wide Web Consortium, M.I.T. 
+http://www.w3.org/People/Gettys/ +jg@w3.org, jg@pa.dec.com + Index: ossp-pkg/sio/BRAINSTORM/IO.MAIL RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO.MAIL,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/IO.MAIL,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/IO.MAIL' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/IO.MAIL +++ - 2024-05-20 02:18:02.339312343 +0200 @@ -0,0 +1,39 @@ +From fielding@kiwi.ics.uci.edu Sat Apr 17 13:19:21 1999 +Path: engelschall.com!mail2news!apache.org!new-httpd-owner-rse+apache=en.muc.de +From: fielding@kiwi.ics.uci.edu ("Roy T. Fielding") +Newsgroups: en.lists.apache-new-httpd +Subject: Re: New idea for apr types. +Date: 17 Apr 1999 07:02:19 +0200 +Organization: Mail2News at engelschall.com +Lines: 21 +Approved: postmaster@m2ndom +Message-ID: <9904161620.aa25418@paris.ics.uci.edu> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 924325339 88339 141.1.129.1 (17 Apr 1999 05:02:19 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 17 Apr 1999 05:02:19 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:29882 + +>The basic idea, is to have apr types implemented as objects. Each type +>has a pointer to every function that is allowed by that type. So, +>connection types (apr_socket_t, apr_corba_t, apr_sna_t, apr_netbeui_t, +>etc) all implement the same basic functions (read, write, create, close, +>shutdown, etc), and each type contains a pointer to those functions. +> +>Then, all connection types could be referred to as void *, and the type +>itself would determine how it should do things under the covers. +> +>This would allow Apache to run over ANY network type, as long as there was +>an apr layer written for it. + +Ummm, yeah, it is called stacked disciplines in sfio, stream pipes in +Onions (my Ada95 library that nobody uses), buffer slices in IO-Lite, +streams in W3C-libwww, or layered I/O in the 2.0 wish list. 
+ +I'm all for it, though I'd use my bucket brigade from Onions and the +IO-Lite memory stuff (assuming sfio wasn't good enough). + +....Roy + Index: ossp-pkg/sio/BRAINSTORM/apache-stackedio RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/apache-stackedio,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/apache-stackedio,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/apache-stackedio' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/apache-stackedio +++ - 2024-05-20 02:18:02.341849334 +0200 @@ -0,0 +1,883 @@ +[djg: comments like this are from dean] + +This past summer, Alexei and I wrote a spec for an I/O Filters API... +this proposal addresses one part of that -- 'stacked' I/O with buff.c. + +We have a couple of options for stacked I/O: we can either use existing +code, such as sfio, or we can rewrite buff.c to do it. We've gone over +the first possibility at length, though, and there were problems with each +implemenation which was mentioned (licensing and compatibility, +specifically); so far as I know, those remain issues. + +Btw -- sfio will be supported w/in this model... it just wouldn't be the +basis for the model's implementation. + + -- Ed Korthof | Web Server Engineer -- + -- ed@organic.com | Organic Online, Inc -- + -- (415) 278-5676 | Fax: (415) 284-6891 -- + +--------------------------------------------------------------------------- +Stacked I/O With BUFFs + Sections: + + 1.) Overview + 2.) The API + User-supplied structures + API functions + 3.) Detailed Description + The bfilter structure + The bbottomfilter structure + The BUFF structure + Public functions in buff.c + 4.) Efficiency Considerations + Buffering + Memory copies + Function chaining + writev + 5.) 
Code in buff.c
+        Default Functions
+        Heuristics for writev
+        Writing
+        Reading
+        Flushing data
+        Closing stacks and filters
+        Flags and Options
+
+*************************************************************************
+                              Overview
+
+The intention of this API is to make Apache's BUFF structure modular
+while retaining high efficiency.  Basically, it involves rewriting
+buff.c to provide 'stacked' I/O -- where the data passes through a
+series of 'filters', which may modify it.
+
+There are two parts to this: the core code for BUFF structures, and the
+"filters" used to implement new behavior.  "filter" is used to refer to
+both the sets of 6 functions, as shown in the bfilter structure in the
+next section, and to BUFFs which are created using a specific bfilter.
+These will also be occasionally referred to as "user-supplied", though
+the Apache core will need to use these as well for basic functions.
+
+The user-supplied functions should use only the public BUFF API, rather
+than any internal details or functions.  One thing which may not be
+clear is that in the core BUFF functions, the BUFF pointer passed in
+refers to the BUFF on which the operation will happen.  OTOH, in the
+user-supplied code, the BUFF passed in is the next buffer down the
+chain, not the current one.
+
+*************************************************************************
+                              The API
+
+                      User-supplied structures
+
+First, the bfilter structure is used in all filters:
+    typedef struct bfilter bfilter;
+    struct bfilter {
+        int (*writev)(BUFF *, void *, struct iovec *, int);
+        int (*read)(BUFF *, void *, char *, int);
+        int (*write)(BUFF *, void *, const char *, int);
+        int (*flush)(BUFF *, void *, const char *, int, bfilter *);
+        int (*transmitfile)(BUFF *, void *, file_info_ptr *);
+        void (*close)(BUFF *, void *);
+    };
+
+bfilters are placed into a BUFF structure along with a
+user-supplied void * pointer.
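To make the calling convention concrete, here is a minimal sketch of a user-supplied 'write' entry following the bfilter signature above. Everything other than that signature is an assumption made so the sketch runs on its own: the BUFF here is a tiny in-memory sink, `bwrite` is a stand-in for the public BUFF write call, and `upcase_write` is a hypothetical filter. It illustrates the two rules stated in the overview: the BUFF passed in is the next one down the chain, and the filter only touches it through the public API.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Stand-in for the real BUFF: an in-memory sink so the sketch is runnable. */
typedef struct buff {
    char data[256];
    int  len;
} BUFF;

/* Stand-in for the public bwrite(): append to the sink and return the
 * number of bytes taken, or -1 if they do not fit. */
static int bwrite(BUFF *fb, const char *buf, int nbyte)
{
    if (nbyte < 0 || fb->len + nbyte > (int)sizeof fb->data)
        return -1;
    memcpy(fb->data + fb->len, buf, (size_t)nbyte);
    fb->len += nbyte;
    return nbyte;
}

/* Hypothetical filter 'write': upper-cases the text and passes it on.
 * `next` is the BUFF below this filter; `info` is the user-supplied
 * void * (unused here).  The input is const, so the filter copies into a
 * scratch buffer before modifying.  Returns the number of input
 * characters processed, or -1 if nothing could be processed. */
static int upcase_write(BUFF *next, void *info, const char *buf, int nbyte)
{
    char tmp[64];
    int done = 0;

    (void)info;
    while (done < nbyte) {
        int n = nbyte - done;
        int i;

        if (n > (int)sizeof tmp)
            n = (int)sizeof tmp;
        for (i = 0; i < n; i++)
            tmp[i] = (char)toupper((unsigned char)buf[done + i]);
        if (bwrite(next, tmp, n) != n)
            return done > 0 ? done : -1;
        done += n;
    }
    return done;
}
```

A function like this would sit in the `write` slot of a bfilter, with the other slots left NULL so the defaults apply.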
+ +Second, the following structure is for use with a filter which can +sit at the bottom of the stack: + + typedef struct { + void *(*bgetfileinfo)(BUFF *, void *); + void (*bpushfileinfo)(BUFF *, void *, void *); + } bbottomfilter; + + + BUFF API functions + +The following functions are new BUFF API functions: + +For filters: + +BUFF * bcreatestack(pool *p, int flags, struct bfilter *, + struct bbottomfilter *, void *); +BUFF * bpushfilter (BUFF *, struct bfilter *, void *); +BUFF * bpushbuffer (BUFF *, BUFF *); +BUFF * bpopfilter(BUFF *); +BUFF * bpopbuffer(BUFF *); +void bclosestack(BUFF *); + +For BUFFs in general: + +int btransmitfile(BUFF *, file_info_ptr *); +int bsetstackopts(BUFF *, int, const void *); +int bsetstackflags(BUFF *, int, int); + +Note that a new flag is needed for bsetstackflags: +B_MAXBUFFERING + +The current bcreate should become + +BUFF * bcreatebuffer (pool *p, int flags, struct bfilter *, void *); + +************************************************************************* + Detailed Explanation + + bfilter structure + +The void * pointer used in all these functions, as well as those in the +bbottomfilter structure and the filter API functions, is always the same +pointer w/in an individual BUFF. + +The first function in a bfilter structure is 'writev'; this is only +needed for high efficiency writing, generally at the level of the system +interface. In it's absence, multiple writes will be done w/ 'write'. +Note that defining 'writev' means you must define 'write'. + +The second is 'write'; this is the generic writing function, taking a BUFF +* to which to write, a block of text, and the length of that block of +text. The expected return is the number of characters (out of that block +of text) which were successfully processed (rather than the number of +characters actually written). 
+ +The third is 'read'; this is the generic reading function, taking a BUFF * +from which to read data, and a void * buffer in which to put text, and the +number of characters to put in that buffer. The expected return is the +number of characters placed in the buffer. + +The fourth is 'flush'; this is intended to force the buffer to spit out +any data it may have been saving, as well as to clear any data the +BUFF code was storing. If the third argument is non-null, then it +contains more text to be printed; that text need not be null terminated, +but the fourth argument contains the length of text to be processed. The +expected return value should be the number of characters handled out +from the third argument (0 if there are none), or -1 on error. Finally, +the fifth argument is a pointer to the bfilter struct containing this +function, so that it may use the write or writev functions in it. Note +that general buffering is handled by BUFF's internal code, and module +writers should not store data for performance reasons. + +The fifth is 'transmitfile', which takes as its arguments a buffer to +which to write (if non-null), the void * pointer containing configuration +(or other) information for this filter, and a system-dependent pointer +(the file_info_ptr structure will be defined on a per-system basis) +containing information required to print the 'file' in question. +This is intended to allow zero-copy TCP in Win32. + +The sixth is 'close'; this is what is called when the connection is being +closed. The 'close' should not be passed on to the next filter in the +stack. Most filters will not need to use this, but if database handles +or some other object is created, this is the point at which to remove it. +Note that flush is called automatically before this. + + bbottomfilter Structure + +The first function, bgetfileinfo, is designed to allow Apache to get +information from a BUFF struct regarding the input and output sources. 
+This is currently used to get the input file number to select on a +socket to see if there's data waiting to be read. The information +returned is platform specific; the void * pointer passed in holds +the void * pointer passed to all user-supplied functions. + +The second function, bpushfileinfo, is used to push file information +onto a buffer, so that the buffer can be fully constructed and ready +to handle data as soon as possible after a client has connected. +The first void * pointer holds platform specific information (in +Unix, it would be a pair of file descriptors); the second holds the +void * pointer passed to all user-supplied functions. + +[djg: I don't think I really agree with the distinction here between +the bottom and the other filters. Take the select() example, it's +valid for any layer to define a fd that can be used for select... +in fact it's the topmost layer that should really get to make this +definition. Or maybe I just have your top and bottom flipped. In +any event I think this should be part of the filter structure and +not separate.] + + The BUFF structure + +A couple of changes are needed for this structure: remove fd and +fd_in; add a bfilter structure; add a pointer to a bbottomfilter; +add three pointers to the next BUFFs: one for the next BUFF in the +stack, one for the next BUFF which implements write, and one +for the next BUFF which implements read. + + + Public functions in buff.c + +BUFF * bpushfilter (BUFF *, struct bfilter *, void *); + +This function adds the filter functions from bfilter, stacking them on +top of the BUFF. It returns the new top BUFF, or NULL on error. + +BUFF * bpushbuffer (BUFF *, BUFF *); + +This function places the second buffer on the top of the stack that +the first one is on. It returns the new top BUFF, or NULL on error. 
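A rough sketch of the three "next" pointers described for the BUFF structure, and of how a push might fill them in. This is an assumption-laden illustration, not the proposed implementation: the filter table is cut down to `read` and `write`, `malloc` stands in for pool allocation, and the field name `filter_info` and the stub functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct buff BUFF;

/* Reduced filter table: only the two entries this sketch inspects. */
struct bfilter {
    int (*read)(BUFF *, void *, char *, int);
    int (*write)(BUFF *, void *, const char *, int);
};

struct buff {
    struct bfilter filter;
    void *filter_info;      /* the user-supplied void * */
    BUFF *next;             /* next BUFF down the stack */
    BUFF *next_writer;      /* nearest BUFF below that implements write */
    BUFF *next_reader;      /* nearest BUFF below that implements read */
};

/* Sketch of bpushfilter(): stack a new filter on `top` and work out where
 * reads and writes from the new BUFF should go.  Returns the new top BUFF,
 * or NULL on error. */
static BUFF *bpushfilter(BUFF *top, struct bfilter *f, void *info)
{
    BUFF *fb;

    if (f == NULL)
        return NULL;
    fb = malloc(sizeof *fb);
    if (fb == NULL)
        return NULL;
    fb->filter = *f;
    fb->filter_info = info;
    fb->next = top;
    fb->next_writer = NULL;
    fb->next_reader = NULL;
    if (top != NULL) {
        fb->next_writer = top->filter.write ? top : top->next_writer;
        fb->next_reader = top->filter.read ? top : top->next_reader;
    }
    return fb;
}

/* Sketch of bpopfilter(): detach the top filter and return the new top,
 * or NULL when nothing remains. */
static BUFF *bpopfilter(BUFF *top)
{
    BUFF *below;

    if (top == NULL)
        return NULL;
    below = top->next;
    free(top);
    return below;
}

/* Do-nothing entries used only to mark a filter as implementing read/write. */
static int stub_read(BUFF *b, void *i, char *p, int n)
{
    (void)b; (void)i; (void)p; (void)n;
    return 0;
}

static int stub_write(BUFF *b, void *i, const char *p, int n)
{
    (void)b; (void)i; (void)p;
    return n;
}
```

With the pointers precomputed at push time, a write from the top of the stack can skip any pass-through filters in between.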
+ +BUFF * bpopfilter(BUFF *); +BUFF * bpopbuffer(BUFF *); + +Unattaches the top-most filter from the stack, and returns the new +top-level BUFF, or NULL on error or when there are no BUFFs +remaining. The two are synonymous. + +void bclosestack(BUFF *); + +Closes the I/O stack, removing all the filters in it. + +BUFF * bcreatestack(pool *p, int flags, struct bfilter *, + struct bbottomfilter *, void *); + +This creates an I/O stack. It returns NULL on error. + +BUFF * bcreatebuffer(pool *p, int flags, struct bfilter *, void *); + +This creates a BUFF for later use with bpushbuffer. The BUFF is +not set up to be used as an I/O stack, however. It returns NULL +on error. + +int bsetstackopts(BUFF *, int, const void *); +int bsetstackflags(BUFF *, int, int); + +These functions, respectively, set options on all the BUFFs in a +stack. The new flag, B_MAXBUFFERING is used to disable a feature +described in the next section, whereby only the first and last +BUFFs will buffer data. + +************************************************************************* + Efficiency Considerations + + Buffering + +All input and output is buffered by the standard buffering code. +People writing code to use buff.c should not concern themselves with +buffering for efficiency, and should not buffer except when necessary. + +The write function will typically be called with large blocks of text; +the read function will attempt to place the specified number of bytes +into the buffer. + +Dean noted that there are possible problems w/ multiple buffers; +further, some applications must not be buffered. This can be +partially dealt with by turning off buffering, or by flushing the +data when appropriate. + +However, some potential problems arise anyway. The simplest example +involves shrinking transformations; suppose that you have a set +of filters, A, B, and C, such that A outputs less text than it +recieves, as does B (say A strips comments, and B gzips the result). 
+Then after a write to A which fills the buffer, A writes to B. +However, A won't write enough to fill B's buffer, so a memory copy +will be needed. This continues till B's buffer fills up, then +B will write to C's buffer -- with the same effect. + +[djg: I don't think this is the issue I was really worried about -- +in the case of shrinking transformations you are already doing +non-trivial amounts of CPU activity with the data, and there's +no copying of data that you can eliminate anyway. I do recognize +that there are non-CPU intensive filters -- such as DMA-capable +hardware crypto cards. I don't think they're hard to support in +a zero-copy manner though.] + +The maximum additional number of bytes which will be copied in this +scenario is on the order of nk, where n is the total number of bytes, +and k is the number of filters doing shrinking transformations. + +There are several possible solutions to this issue. The first +is to turn off buffering in all but the first filter and the +last filter. This reduces the number of unnecessary byte copies +to at most one per byte, however it means that the functions in +the stack will get called more frequently; but it is the default +behavior, overridable by setting the B_MAXBUFFERING with +bsetstackflags. Most filters won't involve a net shrinking +transformation, so even this will rarely be an issue; however, +if the filters do involve a net shrinking transformation, for +the sake of network-efficiency (sending reasonably sized blocks), +it may be more efficient anyway. + +A second solution is more general use of writev for communication +between different buffers. This complicates the programing work, +however. + + + Memory copies + +Each write function is passed a pointer to constant text; if any changes +are being made to the text, it must be copied. However, if no changes +are made to the text (or to some smaller part of it), then it may be +sent to the next filter without any additional copying. 
This should
+provide the minimal necessary memory copies.
+
+[djg: Unfortunately this makes it hard to support page-flipping and
+async i/o because you don't have any reference counts on the data.
+But I go into a little detail on that already in docs/page_io.]
+
+                          Function chaining
+
+In order to avoid unnecessary function chaining for reads and writes,
+when a filter is pushed onto the stack, the buff.c code will determine
+which is the next BUFF which contains a read or write function, and
+reads and writes, respectively, will go directly to that BUFF.
+
+                               writev
+
+writev is a function for efficient writing to the system; in terms of
+this API, however, it also works for dealing with multiple blocks of
+text without doing unnecessary byte copies.  It is not required.
+
+Currently, the system level writev is used in two contexts: for
+chunking and when a block of text is written which, combined with
+the text already in the buffer, would make the buffer overflow.
+
+writev would be implemented both by the default bottom level filter
+and by the chunking filter for these operations.  In addition, writev
+may be used, as noted above, to pass multiple blocks of text w/o
+copying them into a single buffer.  Note that if the next filter does
+not implement writev, however, this will be equivalent to repeated
+calls to write, which may or may not be more efficient.  Up to
+IOV_MAX-2 blocks of text may be passed along in this manner.  Unlike
+the system writev call, the writev in this API should be called only
+once, with an array of iovecs and a count of the number of
+iovecs in it.
+
+If a bfilter defines writev, writev will be called whether or not
+NO_WRITEV is set; hence, it should deal with that case in a reasonable
+manner.
+
+[djg: We can't guarantee atomicity of writev() when we emulate it.
+Probably not a problem, just an observation.]
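The point about emulation can be sketched as follows: when the layer below offers only a single-buffer write, a gather write becomes a loop of writes, and anything else writing to the same sink can slip in between the calls. `struct iovec` is the real POSIX type from `<sys/uio.h>`; `emulated_writev`, `write_fn`, `mem_sink`, and `short_write` are hypothetical names for illustration, and the sink deliberately accepts only a few bytes per call to force partial writes.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>    /* struct iovec (POSIX) */

/* Single-buffer write primitive offered by the layer below. */
typedef ssize_t (*write_fn)(void *sink, const void *buf, size_t n);

/* Emulate a gather write with repeated single-buffer writes.  Returns the
 * number of bytes written, or -1 if nothing could be written.  Not atomic:
 * each wr() call is a separate operation on the sink. */
static ssize_t emulated_writev(write_fn wr, void *sink,
                               const struct iovec *vec, int nvecs)
{
    ssize_t total = 0;
    int i;

    for (i = 0; i < nvecs; i++) {
        const char *p = vec[i].iov_base;
        size_t left = vec[i].iov_len;

        while (left > 0) {
            ssize_t rv = wr(sink, p, left);

            if (rv <= 0)
                return total > 0 ? total : -1;
            p += rv;
            left -= (size_t)rv;
            total += rv;
        }
    }
    return total;
}

/* In-memory sink that takes at most 3 bytes per call. */
struct mem_sink {
    char   data[64];
    size_t len;
};

static ssize_t short_write(void *sink, const void *buf, size_t n)
{
    struct mem_sink *s = sink;

    if (n > 3)
        n = 3;
    if (s->len + n > sizeof s->data)
        return -1;
    memcpy(s->data + s->len, buf, n);
    s->len += n;
    return (ssize_t)n;
}
```

Passing a chunk header and a chunk body as two iovecs through `emulated_writev` yields the same bytes as one system writev would, just spread over several write calls.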
+
+*************************************************************************
+                           Code in buff.c
+
+                          Default Functions
+
+The default actions are generally those currently performed by Apache,
+save that they'll only attempt to write to a buffer, and they'll
+return an error if there are no more buffers.  That is, you must implement
+read, write, and flush in the bottom-most filter.
+
+Except for close(), the default code will simply pass the function call
+on to the next filter in the stack.  Some samples follow.
+
+                        Heuristics for writev
+
+Currently, we call writev for chunking, and when we get enough data that
+the total overflows the buffer.  Since chunking is going to become a
+filter, the chunking filter will use writev; in addition, bwrite will
+trigger bwritev as shown (note that system specific information should
+be kept at the filter level):
+
+in bwrite:
+
+    if (fb->outcnt > 0 && nbyte + fb->outcnt >= fb->bufsiz) {
+        /* build iovec structs */
+        struct iovec vec[2];
+        vec[0].iov_base = (void *) fb->outbase;
+        vec[0].iov_len = fb->outcnt;
+        fb->outcnt = 0;
+        vec[1].iov_base = (void *)buff;
+        vec[1].iov_len = nbyte;
+        return bwritev (fb, vec, 2);
+    } else if (nbyte >= fb->bufsiz) {
+        return write_with_errors(fb,buff,nbyte);
+    }
+
+Note that the code above takes the place of large_write (as well
+as taking code from it).
+
+So, bwritev would look something like this (copying and pasting freely
+from the current source for writev_it_all, which could be replaced):
+
+-----
+int bwritev (BUFF * fb, struct iovec * vec, int nvecs) {
+    if (!fb)
+        return -1; /* the bottom level filter implemented neither write nor
+                    * writev. */
+    if (fb->bfilter.writev) {
+        /* user-supplied functions get the next BUFF down plus the
+         * filter's void * pointer (fb->bfilter_info, as in buff_read) */
+        return fb->bfilter.writev(fb->next, fb->bfilter_info, vec, nvecs);
+    } else if (fb->bfilter.write) {
+        /* while it's nice and easy to build the vector and crud, it's painful
+         * to deal with partial writes (esp. w/ the vector)
+         */
+        int i = 0, rv;
+        while (i < nvecs) {
+            do {
+                rv = fb->bfilter.write(fb->next, fb->bfilter_info,
+                                       vec[i].iov_base, vec[i].iov_len);
+            } while (rv == -1 && (errno == EINTR || errno == EAGAIN)
+                     && !(fb->flags & B_EOUT));
+            if (rv == -1) {
+                if (errno != EINTR && errno != EAGAIN) {
+                    doerror (fb, B_WR);
+                }
+                return -1;
+            }
+            fb->bytes_sent += rv;
+            /* recalculate vec to deal with partial writes */
+            while (rv > 0) {
+                if (rv < vec[i].iov_len) {
+                    vec[i].iov_base = (char *)vec[i].iov_base + rv;
+                    vec[i].iov_len -= rv;
+                    rv = 0;
+                } else {
+                    rv -= vec[i].iov_len;
+                    ++i;
+                }
+            }
+            if (fb->flags & B_EOUT)
+                return -1;
+        }
+        /* if we got here, we wrote it all */
+        return 0;
+    } else {
+        return bwritev(fb->next,vec,nvecs);
+    }
+}
+-----
+The default filter's writev function will look pretty much like
+writev_it_all.
+
+
+                              Writing
+
+The general case for writing data is significantly simpler with this
+model.  Because special cases are not dealt with in the BUFF core,
+a single internal interface to writing data is possible; I'm going
+to assume it's reasonable to standardize on write_with_errors, but
+some other function may be more appropriate.
+
+In the revised bwrite (which I'll omit for brevity), the following
+must be done:
+    check for error conditions
+    check to see if any buffering is done; if not, send the data
+        directly to the write_with_errors function
+    check to see if we should use writev or write_with_errors
+        as above
+    copy the data to the buffer (we know it fits since we didn't
+        need writev or write_with_errors)
+
+The other work the current bwrite is doing is
+    ifdef'ing around NO_WRITEV
+    numerous decisions regarding whether or not to send chunks
+
+Generally, buff.c has a number of functions whose entire purpose is
+to handle particular special cases wrt chunking, all of which could
+be simplified with a chunking filter.
+
+write_with_errors would not need to change; buff_write would.
Here
+is a new version of it:
+
+-----
+/* the lowest level writing primitive */
+static ap_inline int buff_write(BUFF *fb, const void *buf, int nbyte)
+{
+    if (fb->bfilter.write)
+        /* filter functions take the next BUFF, then the filter's void * */
+        return fb->bfilter.write(fb->next_writer,fb->bfilter_info,buf,nbyte);
+    else
+        return bwrite(fb->next_writer,buf,nbyte);
+}
+-----
+
+If the btransmitfile function is called on a buffer which doesn't implement
+it, the system will attempt to read data from the file identified
+by the file_info_ptr structure and use other methods to write it to the
+buffer.
+
+                              Reading
+
+One of the basic reading functions in Apache 1.3b3 is buff_read;
+here is how it would look within this spec:
+
+-----
+/* the lowest level reading primitive */
+static ap_inline int buff_read(BUFF *fb, void *buf, int nbyte)
+{
+    if (!fb)
+        return -1; /* the bottom level filter is not set up properly */
+
+    if (fb->bfilter.read)
+        /* argument order follows the bfilter 'read' prototype */
+        return fb->bfilter.read(fb->next_reader,fb->bfilter_info,buf,nbyte);
+    else
+        return bread(fb->next_reader,buf,nbyte);
+}
+-----
+The code currently in buff_read would become part of the default
+filter.
+
+
+                           Flushing data
+
+flush will get passed on down the stack automatically, with recursive
+calls to bflush.  The user-supplied flush function will be called then,
+and also before close is called.  The user-supplied flush should not
+call flush on the next buffer.
+
+[djg: Poorly written "expanding" filters can cause some nastiness
+here.  In order to flush a layer you have to write out your current
+buffer, and that may cause the layer below to overflow a buffer and
+flush it.  If the filter is expanding then it may have to add more to
+the buffer before flushing it to the layer below.  It's possible that
+the layer below will end up having to flush twice.  It's a case where
+writev-like capabilities are useful.]
+
+                      Closing Stacks and Filters
+
+When a filter is removed from the stack, flush will be called then close
+will be called.
When the entire stack is being closed, this operation +will be done automatically on each filter within the stack; generally, +filters should not operate on other filters further down the stack, +except to pass data along when flush is called. + + Flags and Options + +Changes to flags and options using the current functions only affect +one buffer. To affect all the buffers on down the chain, use +bsetstackopts or bsetstackflags. + +bgetopt is currently only used to grab a count of the bytes sent; +it will continue to provide that functionality. bgetflags is +used to provide information on whether or not the connection is +still open; it'll continue to provide that functionality as well. + +The core BUFF operations will remain, though some operations which +are done via flags and options will be done by attaching appropriate +filters instead (eg. chunking). + +[djg: I'd like to consider filesystem metadata as well -- we only need +a few bits of metadata to do HTTP: file size and last modified. We +need an etag generation function, it is specific to the filters in +use. You see, I'm envisioning a bottom layer which pulls data out of +a database rather than reading from a file.] + +------- + +This file is there so that I do not have to remind myself +about the reasons for Layered IO, apart from the obvious one. + +0. To get away from a 1 to 1 mapping + + i.e. a single URI can cause multiple backend requests, + in arbitrary configurations, such as in parallel, tunnel/piped, + or in some sort of funnel mode. Such multiple backend + requests, with fully layered IO can be treated exactly + like any URI request; and recursion is born :-) + +1. To do on the fly charset conversion + + To be able, in theory, to send out your content using + latin1, latin2 or any other charset; generated from static + _and_ dynamic content in other charsets (typically unicode + encoded as UTF7 or UTF8).
Such conversion is prompted by + things like the user-agent string, a cookie, or other hints + about the capabilities of the OS, language preferences and + other (in)capabilities of the final recipient. + +2. To be able to do fancy templates + + Have your application/cgi sending out an XML structure of + field/value pair-ed contents; which is substituted into a + template by the web server; possibly based on information + accessible/known to the webserver which you do not want to + be known to the backend script. Ideally that template would + be just as easy to generate by a backend as well (see 0). + +3. On the fly translation + + And other general text and output munging, such as translating + an English page into Spanish whilst it goes through your Proxy, + or JPEG-ing a GIF generated by mod_perl+gd. + +Dw. +--------- + +From dgaudet@arctic.org Fri Feb 20 00:36:52 1998 +Date: Fri, 20 Feb 1998 00:35:37 -0800 (PST) +From: Dean Gaudet +To: new-httpd@apache.org +Subject: page-based i/o +X-Comment: Visit http://www.arctic.org/~dgaudet/legal for information regarding copyright and disclaimer. +Reply-To: new-httpd@apache.org + +Ed asked me for more details on what I mean when I talk about "page-based +zero copy i/o". + +While writing mod_mmap_static I was thinking about the primitives that the +core requires of the filesystem. What exactly is it that ties us into the +filesystem? and how would we abstract it? The metadata (last modified +time, file length) is actually pretty easy to abstract. It's also easy to +define an "index" function so that MultiViews and such can be implemented. +And with layered I/O we can hide the actual details of how you access +these "virtual" files. + +But therein lies an inefficiency. If we had only bread() for reading +virtual files, then we would enforce at least one copy of the data. +bread() supplies the place that the caller wants to see the data, and so +the bread() code has to copy it.
But there's very little reason that +bread() callers have to supply the buffer... bread() itself could supply +the buffer. Call this new interface page_read(). It looks something like +this: + + typedef struct { + const void *data; + size_t data_len; /* amt of data on page which is valid */ + ... other stuff necessary for managing the page pool ... + } a_page_head; + + /* returns NULL if an error or EOF occurs, on EOF errno will be + * set to 0 + */ + a_page_head *page_read(BUFF *fb); + + /* queues entire page for writing, returns 0 on success, -1 on + * error + */ + int page_write(BUFF *fb, a_page_head *); + +It's very important that a_page_head structures point to the data page +rather than be part of the data page. This way we can build a_page_head +structures which refer to parts of mmap()d memory. + +This stuff is a little more tricky to do, but is a big win for performance. +With this integrated into our layered I/O it means that we can have +zero-copy performance while still getting the advantages of layering. + +But note I'm glossing over a bunch of details... like the fact that we +have to decide if a_page_heads are shared data, and hence need reference +counting (i.e. I said "queues for writing" up there, which means some +bit of the a_page_head data has to be kept until its actually written). +Similarly for the page data. + +There are other tricks in this area that we can take advantage of -- +like interprocess communication on architectures that do page flipping. +On these boxes if you write() something that's page-aligned and page-sized +to a pipe or unix socket, and the other end read()s into a page-aligned +page-sized buffer then the kernel can get away without copying any data. +It just marks the two pages as shared copy-on-write, and only when +they're written to will the copy be made. So to make this work, your +writer uses a ring of 2+ page-aligned/sized buffers so that it's not +writing on something the reader is still reading. 
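The point that a_page_head structures point at the data rather than live inside it, and that queued pages need reference counting, can be sketched roughly as follows. This is only an illustration under stated assumptions: the refcount field, page_head_new, page_head_ref, page_head_unref and region_to_pages are hypothetical names that do not appear in the proposal above, the count is not thread-safe, and the "region" stands in for mmap()d memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical page header: it refers to data it does not own, so one
 * mapped region can be described by many small headers. */
typedef struct {
    const void *data;     /* points into the region, never copied */
    size_t      data_len; /* amount of data on the page which is valid */
    int         refcount; /* assumption: plain, non-atomic count */
} a_page_head;

/* Allocate a header referring to len bytes at data; starts with one reference. */
static a_page_head *page_head_new(const void *data, size_t len)
{
    a_page_head *ph = malloc(sizeof *ph);
    if (ph == NULL)
        return NULL;
    ph->data = data;
    ph->data_len = len;
    ph->refcount = 1;
    return ph;
}

/* Take an extra reference, e.g. when a page is queued for writing. */
static void page_head_ref(a_page_head *ph)
{
    assert(ph->refcount > 0);
    ph->refcount++;
}

/* Drop a reference; returns 1 if the header was freed, 0 otherwise.
 * The data itself is not freed: the header never owned it. */
static int page_head_unref(a_page_head *ph)
{
    assert(ph->refcount > 0);
    if (--ph->refcount == 0) {
        free(ph);
        return 1;
    }
    return 0;
}

/* Describe a region as page_size-sized slices without copying any data.
 * Fills out[] with at most max_out headers and returns how many were made. */
static size_t region_to_pages(const char *base, size_t len, size_t page_size,
                              a_page_head **out, size_t max_out)
{
    size_t n = 0, off = 0;

    assert(page_size > 0);
    while (off < len && n < max_out) {
        size_t chunk = (len - off < page_size) ? len - off : page_size;
        out[n] = page_head_new(base + off, chunk);
        if (out[n] == NULL)
            break;
        off += chunk;
        ++n;
    }
    return n;
}
```

A writer that queues a page would call page_head_ref before handing it to the lower layer and page_head_unref once the write has completed, so the header survives until the data is actually on the wire.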
+ +Dean + +---- + +For details on HPUX and avoiding extra data copies, see +. + +(note that if you get the postscript version instead, you have to +manually edit it to remove the front page before any version of +ghostscript that I have used will read it) + +---- + +I've been told by an engineer in Sun's TCP/IP group that zero-copy TCP +in Solaris 2.6 occurs when: + + - you've got the right interface card (OC-12 ATM card I think) + - you use write() + - your write buffer is 16k aligned and a multiple of 16k in size + +We currently get the 16k stuff for free by using mmap(). But sun's +current code isn't smart enough to deal with our initial writev() +of the headers and first part of the response. + +---- + +Systems that have a system call to efficiently send the contents of a +descriptor across the network. This is probably the single best way +to do static content on systems that support it. + +HPUX: (10.30 and on) + + ssize_t sendfile(int s, int fd, off_t offset, size_t nbytes, + const struct iovec *hdtrl, int flags); + + (allows you to add headers and trailers in the form of iovec + structs) Marc has a man page; ask if you want a copy. Not included + due to copyright issues. man page also available from + http://docs.hp.com/ (in particular, + http://docs.hp.com:80/dynaweb/hpux11/hpuxen1a/rvl3en1a/@Generic__BookTextView/59894;td=3 ) + +Windows NT: + + BOOL TransmitFile( SOCKET hSocket, + HANDLE hFile, + DWORD nNumberOfBytesToWrite, + DWORD nNumberOfBytesPerSend, + LPOVERLAPPED lpOverlapped, + LPTRANSMIT_FILE_BUFFERS lpTransmitBuffers, + DWORD dwFlags + ); + + (does it start from the current position in the handle? I would + hope so, or else it is pretty dumb.) + + lpTransmitBuffers allows for headers and trailers. 
+ + Documentation at: + + http://premium.microsoft.com/msdn/library/sdkdoc/wsapiref_3pwy.htm + http://premium.microsoft.com/msdn/library/conf/html/sa8ff.htm + + Even less related to page based IO: just context switching: + AcceptEx does an accept(), and returns the start of the + input data. see: + + http://premium.microsoft.com/msdn/library/sdkdoc/pdnds/sock2/wsapiref_17jm.htm + + What this means is you require one less syscall to do a + typical request, especially if you have a cache of handles + so you don't have to do an open or close. Hmm. Interesting + question: then, if TransmitFile starts from the current + position, you need a mutex around the seek and the + TransmitFile. If not, you are just limited (eg. byte + ranges) in what you can use it for. + + Also note that TransmitFile can specify TF_REUSE_SOCKET, so that + after use the same socket handle can be passed to AcceptEx. + Obviously only good where we don't have a persistent connection + to worry about. + +---- + +Note that all this is shot to bloody hell by HTTP-NG's multiplexing. +If fragment sizes are big enough, it could still be worthwhile to +do copy avoidance. It also causes performance issues because of +its credit system that limits how much you can write in a single +chunk. + +Don't tell me that if HTTP-NG becomes popular we will see vendors +embedding SMUX (or whatever multiplexing is used) in the kernel to +get around this stuff. There we go, Apache with a loadable kernel +module.
+ +---- + +Larry McVoy's document for SGI regarding sendfile/TransmitFile: +ftp://ftp.bitmover.com/pub/splice.ps.gz +From dgaudet@arctic.org Sun Jun 20 11:07:58 1999 +Path: engelschall.com!mail2news!apache.org!new-httpd-owner-rse+apache=en.muc.de +From: dgaudet@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: mpm update +Date: 19 Jun 1999 07:17:00 +0200 +Organization: Mail2News at engelschall.com +Lines: 104 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 929769420 64417 141.1.129.1 (19 Jun 1999 05:17:00 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 19 Jun 1999 05:17:00 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:31056 + +I imported mpm-3 into the apache-2.0 repository (tag mpm-3 if you want +it). + +Then I threw in a bunch of my recent email ramblings, because I'm getting +tired of repeating them, mostly off-list to folks who ask "why doesn't +apache do XYZ?" I intend to be more proactive in this area, because it +can only help. + +Then I ripped up BUFF and broke lots of stuff and put in a first crack at +layering. Info on that below. + +If you check out the tree, and build it (using Configuration.mpm) you +should be able to serve up the top page of the manual, that's all I've +tested so far ;) + +Dean + +goals? we need an i/o abstraction which has these properties: + +- buffered and non-buffered modes + + The buffered mode should look like FILE *. + + The non-buffered mode should look more like read(2)/write(2). + +- blocking and non-blocking modes + + The blocking mode is the "easy" mode -- it's what most module writers + will see. The non-blocking mode is the "hard" mode, this is where + module writers wanting to squeeze out some speed will have to play. + In order to build async/sync hybrid models we need the + non-blocking i/o abstraction. 
+ +- timed reads and writes (for blocking cases) + + This is part of my jihad against asynchronous notification. + +- i/o filtering or layering + + Yet another Holy Grail of computing. But I digress. These are + hard when you take into consideration non-blocking i/o -- you have + to keep lots of state. I expect our core filters will all support + non-blocking i/o, well at least the ones I need to make sure we kick + ass on benchmarks. A filter can deny a switch to non-blocking mode, + the server will have to recover gracefully (ha). + +- copy-avoidance + + Hey what about zero copy a la IO-Lite? After having experienced it + in a production setting I'm no longer convinced of its benefits. + There is an enormous amount of overhead keeping lists of buffers, + and reference counts, and cleanup functions, and such which requires + a lot of tuning to get right. I think there may be something here, + but it's not a cakewalk. + + What I do know is that the heuristics I put into apache-1.3 to choose + writev() at times are almost as good as what you can get from doing + full zero-copy in the cases we *currently* care about. To put it + another way, let's wait another generation to deal with zero copy. + + But sendfile/transmitfile/etc. those are still interesting. + + So instead of listing "zero copy" as a property, I'll list + "copy-avoidance". + +So far? + +- ap_bungetc added +- ap_blookc changed to return the character, rather than take a char *buff +- in theory, errno is always useful on return from a BUFF routine +- ap_bhalfduplex, B_SAFEREAD will be re-implemented using a layer I think +- chunking gone for now, will return as a layer +- ebcdic gone for now... it should be a layer + +- ap_iol.h defined, first crack at the layers... + + Step back a second to think on it. Much like we have fread(3) + and read(2), I've got a BUFF and an ap_iol abstraction. An ap_iol + could use a BUFF if it requires some form of buffering, but many + won't require buffering... 
or can do a better job themselves. + + Consider filters such as: + - ebcdic -> ascii + - encryption + - compression + These all share the property that no matter what, they're going to make + an extra copy of the data. In some cases they can do it in place (read) + or into a fixed buffer... in most cases their buffering requirements + are different than what BUFF offers. + + Consider a filter such as chunking. This could actually use the writev + method to get its job done... depends on the chunks being used. This + is where zero-copy would be really nice, but we can get by with a few + heuristics. + + At any rate -- the NSPR folks didn't see any reason to included a + buffered i/o abstraction on top of their layered i/o abstraction... so + I feel like I'm not the only one who's thinking this way. + +- iol_unix.c implemented... should hold us for a bit + + + + Index: ossp-pkg/sio/BRAINSTORM/asf-ideas.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/asf-ideas.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/asf-ideas.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/asf-ideas.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/asf-ideas.txt +++ - 2024-05-20 02:18:02.344804387 +0200 @@ -0,0 +1,328 @@ +From manoj@io.com Wed Aug 16 21:38:14 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-7115-rse+apache=en.muc.de +From: manoj@io.com (Manoj Kasichainula) +Newsgroups: en.lists.apache-new-httpd +Subject: My proposal for buckets/filtering in 3.0. 
+Date: 14 Aug 2000 18:59:14 +0200 +Organization: Mail2News at engelschall.com +Lines: 137 +Approved: postmaster@m2ndom +Message-ID: <20000814050814.A16555@manojk.users.mindspring.com> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 966272354 36734 141.1.129.1 (14 Aug 2000 16:59:14 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 14 Aug 2000 16:59:14 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:41295 + +This is quite different than the other proposals. From the feedback I +got during the meeting, this isn't suitable for Apache 2.0 because +it's still not developed enough (it popped into my head *during* the +meeting after all) and it requires more rewriting of existing modules +than the current 2.0 filtering scheme. I agree with the sentiment that +says to get 2.0 out soon rather than adding more features, so I agree. + +Also, I have a bad feeling someone else already put forward this idea. +But, I'll send it out anyway. + +The main difference between this design and the others is that there +are more buckets and fewer filters. + +Filters are things that talk to the network, and they are two-way. SSL +and chunking are filters. They don't change very much in this case; as +is the case now, they are just iols that can handle buckets. + +Most everything else is a bucket in this case. Actually, they are +probably better described as matrushka dolls or those little nested +plastic barrels. In fact, I'll call them barrels to increase +confusion. + +Brigades don't exist in this scheme. They are just compound barrels. + +I think that each barrel will have a URI-identifier. This is useful +for caching. + +Content-manipulating filters in the other designs are barrels in this +one. Also, while they were "writers" in the other designs, they are +"readers" in this one. 
So instead of a push-style design where filters +are written to by other filters, this scheme has barrels reading from +other barrels. + +So here's how a request would proceed: + +- HTTP request comes in, munched on by various filters, and passed to + the request handler. + +- request handler picks out the URI that was requested, creates a URI + barrel initialized to that URI, and reads the content and metadata + out of it. + ("ap_create_uri_barrel("http://www.deedee.com/dancing/", + barrel_types)->read();") + +- The URI barrel, when created, figures out how the content should be + delivered, and creates subbarrels to deal with them. I'll give three + cases: + + Case #1: A file on the disk + Case #2: A cgi script that outputs postscript that gets interpreted + into a PNG image file + Case #3: a proxy request + + In case #1, the URI barrel figures out that it's accessing a file, + creates a filehandle barrel, and then binds its own content-handling + calls to the filehandle barrel. + + file -> uri -> HTTP handler + + In case #2, the URI barrel creates a CGI barrel that is initialized + with a file barrel pointing to the CGI script. Then uri_barrel + creates a postscript barrel initialized with the CGI barrel and the + parameter "PNG". The URI barrel then binds its content-handling + calls to the postscript barrel. + + CGI script file -> mod_cgi -> mod_postscript -> uri -> HTTP handler + + In case #3, the URI barrel figures out that this is a remote request + and creates an HTTP client barrel. The URI barrel then binds its + content request handler to the HTTP barrel. Maybe there needs to be an + intervening proxy barrel, or maybe the URI barrel knows proxy + semantics, or maybe the request handler needs smarts about HTTP + proxies. I'm not in touch with HTTP/1.1 proxying enough to know.
+ + HTTP client -> uri -> HTTP server handler + +- The request handler gets barrels back, and using the appropriate + barrel functions, writes their headers and data through the filters + back to the client. + +When anything creates a barrel, it passes in: + +- a pool to allocate memory from. The stuff read from the barrel must + be in the scope of that pool. +- a list of capabilities that are required, and a list of preferred + capabilities. capabilities include: "send-from-file-descriptor", + "send from memory", "write-to-content", and so on. + + This feels like content negotiation, which scares me. But, very + little of it is actually necessary in a first pass. A single + "memory-block" capability is all that's mandatory, really. + Everything else is extra features and optimization. + +The barrel being created will then attempt to meet those wishes if +possible, or return an error. + +This is really not mapped out well-enough, and needs code, which will +wait until Apache 3.0 development starts up. But, here are the cool +features I can imagine: + +- non-blocking I/O can be a capability of a barrel. One API of the + barrel would be to return the file descriptor (or "event") it is + waiting for, so that it can be selected on, and a full event-based + server should be possible. When some barrel three levels deep + doesn't support non-blocking I/O, the request handler can decide to + punt to a separate thread. This way, different programming models + for modules can be supported. + +- Writability would be another capability. If a barrel is writable, + that URI is available to DAV. + +- Set-asides and lifetime of data aren't that much of an issue anymore + (or at least I haven't thought of a case where they are). Barrels + are naturally kept around only as long as they are needed, since + their scope is determined by the consumer of the barrel. Caching is + done with a cacher barrel that uses a large-scoped pool, for example.
+ +- This scheme allows not just chains of barrels, but trees. So half of + a document can come from PHP and half can come from SSI. You just + need a container barrel that knows what parts are interpreted by + what modules. There could be an MS-Word barrel that munches on + multiple HTTP requests for HTML + images. This scares me. + +- Because of the content-negotiation features, a barrel can always + figure out what the optimal format for sending content back should + be. + +- subrequests are really easy (thanks for the idea of URI barrels, + Ryan) + +- Should allow "magic cache" like was discussed back at the June/July + '98 meeting. In fact, it should be really easy. + +There are plenty of unanswered questions here, such as how metadata +will work, how exactly proxies fit into this, and how things like DAV +collections fit in. But I'm sleepy. + +From fielding@kiwi.ICS.UCI.EDU Wed Aug 16 21:39:15 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-7124-rse+apache=en.muc.de +From: fielding@kiwi.ICS.UCI.EDU ("Roy T. Fielding") +Newsgroups: en.lists.apache-new-httpd +Subject: Re: My proposal for buckets/filtering in 3.0. +Date: 15 Aug 2000 07:00:31 +0200 +Organization: Mail2News at engelschall.com +Lines: 155 +Approved: postmaster@m2ndom +Message-ID: <200008141352.aa13983@gremlin-relay.ics.uci.edu> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 966315631 57145 141.1.129.1 (15 Aug 2000 05:00:31 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 15 Aug 2000 05:00:31 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:41304 + +>Most everything else is a bucket in this case. Actually, they are +>probably better described as matrushka dolls or those little nested +>plastic barrels. In fact, I'll call them barrels to increase +>confusion. + +Hah, cute, but how is it different than our current subrequests? 
+ +>I think that each barrel will have a URI-identifier. This is useful +>for caching. + +Yes. We would also need to name the handlers that define how a +barrel is constructed for a given request-URI. + +>Content-manipulating filters in the other designs are barrels in this +>one. Also, while they were "writers" in the other designs, they are +>"readers" in this one. So instead of a push-style design where filters +>are written to by other filters, this scheme has barrels reading from +>other barrels. +> +>So here's how a request would proceed: +> +>- HTTP request comes in, munched on by various filters, and passed to +> the request handler. +> +>- request handler picks out the URI that was requested, creates a URI +> barrel initialized to that URI, and reads the content and metadata +> out of it. +> ("ap_create_uri_barrel("http://www.deedee.com/dancing/", +> barrel_types)->read();") +> +>- The URI barrel, when created, figures out how the content should be +> delivered, and creates subbarrels to deal with them. I'll give three +> cases: +> +> Case #1: A file on the disk +> Case #2: A cgi script that outputs postscript that gets interpreted +> into a PNG image file +> Case #3: a proxy request +> +> In case #1, the URI barrel figures out that it's accessing a file, +> creates a filehandle barrel, and then binds its own content-handling +> calls to the filehandle barrel. +> +> file -> uri -> HTTP handler +> +> In case #2, the URI barrel creates a CGI barrel that is initialized +> with a file barrel pointing to the CGI script. Then uri_barrel +> creates a postscript barrel initialized with the CGI barrel and the +> parameter "PNG". The URI barrel then binds its content-handling +> calls to the postscript barrel. +> +> CGI script file -> mod_cgi -> mod_postscript -> uri -> HTTP handler +> +> In case #3, the URI barrel figures out that this is a remote request +> and creates an HTTP client barrel. 
The URI barrel then binds its +> content request handler to the HTTP barrel. Maybe there needs to be an +> intervening proxy barrel, or maybe the URI barrel knows proxy +> semantics, or maybe the request handler needs smarts about HTTP +> proxies. I'm not in touch with HTTP/1.1 proxying enough to know. +> +> HTTP client -> uri -> HTTP server handler +> +>- The request handler gets barrels back, and using the appropriate +> barrel functions, writes their headers and data through the filters +> back to the client. + +Yep, subrequests. + +>When anything creates a barrel, it passes in: +> +>- a pool to allocate memory from. The stuff read from the barrel must +> be in the scope of that pool. +>- a list of capabilities that are required, and a list of preferred +> capabilities. capabilities include: "send-from-file-descriptor", +> "send from memory", "write-to-content", and so on. +> +> This feels like content negotiation, which scares me. But, very +> little of it is actually necessary in a first pass. A single +> "memory-block" capability is all that's mandatory, really. +> Everything else is extra features and optimization. + +Hmmm, sorry, I think I've heard that phrase one too many times this +past week. Figure out what the architectural context is -- all of the +forces that will impact this design in terms of the application needs. +When you have covered all of that, everything else is extra features +and optimization. Things like single-copy IO and sendfile support are +not optimizations -- they are the requirements that motivate our next +generation architecture. + +>The barrel being created will then attempt to meet those wishes if +>possible, or return an error. + +Hmmm, that sounds like magic to me. The whole point of bucket brigades +was to specify that magic in a way that can be standard for all modules. + +>This is really not mapped out well-enough, and needs code, which will +>wait until Apache 3.0 development starts up.
But, here are the cool +>features I can imagine: +> +>- non-blocking I/O can be a capability of a barrel. One API of the +> barrel would be to return the file descriptor (or "event") it is +> waiting for, so that it can be selected on, and a full event-based +> server should be possible. When some barrel three levels deep +> doesn't support non-blocking I/O, the request handler can decide to +> punt to a separate thread. This way, different programming models +> for modules can be supported. + +It does mean that something has to read from the barrel and write to +the network, right? Or is this a model where we give the network to +the barrel and it writes? Kind of hard to manage the latter. + +>- Writability would be another capability. If a barrel is writable, +> that URI is available to DAV. + +Yes, all source resources should be available to DAV. The way to do that +is to assign them URIs and pass their identifiers as metadata. Let the +protocol filter decide what to do with that information. + +>- Set-asides and lifetime of data aren't that much of an issue anymore +> (or at least I haven't thought of a case where they are). Barrels +> are naturally kept around only as long as they are needed, since +> their scope is determined by the consumer of the barrel. Caching is +> done with a cacher barrel that uses a large-scoped pool, for example. + +Right, just like the 1.3 subrequest architecture. + +>- This scheme allows not just chains of barrels, but trees. So half of +> a document can come from PHP and half can come from SSI. You just +> need a container barrel that knows what parts are interpreted by +> what modules. There could be an MS-Word barrel that munches on +> multiple HTTP requests for HTML + images. This scares me. + +It should. Keep in mind that all of the sources would have to pass +through the access control steps.
+ +>- Because of the content-negotiation features, a barrel can always +> figure out what the optimal format for sending content back should +> be. + +You mean every barrel will have to know everything about the request, +including things like HTTP negotiation? Yikes. + +>- subrequests are really easy (thanks for the idea of URI barrels, +> Ryan) + +I don't see much difference from 1.3 subrequests. + +>- Should allow "magic cache" like was discussed back at the June/July +> '98 meeting. In fact, it should be really easy. + +Easy to identify cacheable items, yes, but how easy is it for the +cache manager to manage overall allocations and reap old entries? + +....Roy + Index: ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt +++ - 2024-05-20 02:18:02.347478772 +0200 @@ -0,0 +1,141 @@ +Copy avoidance in networked file systems +---------------------------------------- + + Jose' Carlos Brustoloni + Bell Labs, Lucent Technologies + jcb@research.bell-labs.com + +The point-to-point bandwidth of gigabit networks can surpass +the main memory copy bandwidth of many current hosts. Therefore, +researchers have been devoting considerable attention to the problem +of copy avoidance in network I/O. In particular, a recent study shows +that copying can be avoided without modifying the semantics of existing +networking APIs [1]. + +In contrast, far less attention has been recently devoted to copy +avoidance in file I/O. This neglect may be motivated by several +subtle misperceptions: + +1) Disks are far slower than main memory (or gigabit networks). 
+ + This is indeed true, but copy avoidance can still be worthwhile + in file I/O because: + (a) copy avoidance can reduce CPU utilization and significantly + improve the throughput of file servers, which often are CPU-bound, and + (b) caching can avoid physical disk I/O and greatly speed up file systems. + +2) Many copy avoidance techniques for network I/O are not useful in file I/O. + + Indeed, copy avoidance techniques for network I/O + often exploit the fact that buffers are ephemeral, i.e. are + deallocated as soon as processing of the corresponding input or + output request completes. On the contrary, buffers used in file I/O + often are cached. For example, emulated copy [1] is a copy avoidance + scheme for network I/O that uses input alignment and page swapping + on input and TCOW, a form of copy-on-write, on output. If used on + file input, page swapping would corrupt the file system cache + with the previous contents of the client input buffers. + If used on file output, TCOW would allow cached output pages to + be corrupted because, after output completes, the output reference + is lost and therefore the pages can be overwritten or reused. + + This does not mean, however, that copy avoidance is unattainable + in file I/O. Systems usually also offer mapped file I/O, + which allows file data to be passed between applications and the + operating system by page mapping and unmapping. Mapped files are a practical + solution that is already widely available for copy avoidance in file I/O. + +3) Copying between mapped files and network I/O buffers can be unavoidable + because of page alignment constraints. + + For example, in a networked file server, data may be received from + the network for output to the file system. The data will usually be + preceded by an application-layer header specifying the file and + offset from the beginning of the file (for simplicity, let us assume + that the offset is multiple of the page size). 
This header can make + copy avoidance difficult because (a) the application must read the + header to determine the file and (b) the header may make the following data + arbitrarily aligned, whereas data must be page-aligned for mapped file I/O. + + However, I show that: + + (a) If the network adapter supports system-aligned buffering + (early demultiplexing or buffer snap-off) [2], then the application + can peek at the header and, after decoding it, input the data + directly to the correct mapped file region, using emulated copy. + Data is passed between network and file system with copy avoidance + and without any modifications in existing APIs. + + (b) Even without such adapter support, copy avoidance is possible with + header patching, a novel software optimization. + + Let h' be the preferred alignment for input from the network (usually + equal to the length of any unstripped protocol headers below the + application layer), h be the length of the application-layer header, + and l be the data length (less than or equal to the network's + maximum transmission unit minus the lengths of headers at network + or higher layers). h' must be fixed and known by both sender and + receiver. On the contrary, h and l may vary from packet to packet. + Using header patching, the sender transmits the application-layer + header followed by the data starting at file offset o + h' + h + and of length l - h' - h, followed by data starting at file offset + o and of length h' + h (to achieve this out-of-order transmission, + the sender may use, e.g., Unix's writev call with a gather list). + The receiver peeks at the first h bytes of the input + (using, e.g., Unix's recv with MSG_PEEK flag), + decodes the application-layer header, and determines the address a + corresponding to file offset o (multiple of the page size) + in the correct mapped file region. The receiver then inputs l - h' + bytes to address a + h', followed by h' + h bytes to address a. 
+ This causes most of the data to be passed by page swapping, after which + the data corresponding to offset o and of length h' + h is + patched on top of the application- and lower-layer + headers at address a. After patching, the input buffer starts at + the correct offset in the mapped file region and runs uninterrupted + for length l with the data in correct order, as illustrated by the + following figure. + + +----+---+----------+----+ + Packet: | h' | h | d1 | d0 | + +----+---+----------+----+ + + Pooled NW +----+---+----------+ +----+--------------+ + buffers: | h' | h | d1 | | d0 | | + +----+---+----------+ +----+--------------+ + | | + reverse | ^ | ^ | + copyout | | | | swap | + | | v | + Mapped +----+---+----------+ | + file: | | h | d1 | | + +----+---+----------+ | + \--------/ | + ^ | + | | + patch +-----------------------+ + +My experiments on the Credit Net ATM network at 512 Mbps show that +copy avoidance can substantially improve the performance of networked +file systems. Because of cache effects, copy avoidance benefits are +synergistic: Greatest benefits are obtained when copying is avoided on the +entire end-to-end data path, including network and file I/O. +Additionally, the experiments confirm each of the above claims. + +References +---------- + +[1] J. Brustoloni and P. Steenkiste. ``Effects of Buffering + Semantics on I/O performance'', in Proc. OSDI'96, + USENIX, Oct. 1996, pp. 277-291. Also available from + http://www.cs.cmu.edu/~jcb/. + +[2] J. Brustoloni and P. Steenkiste. ``Copy Emulation in + Checksummed, Multiple-Packet Communication'', in + Proc. INFOCOM'97, IEEE, April 1997. Also available from + http://www.cs.cmu.edu/~jcb/. + +--------------------------------------------------------------------------- + +Work performed while at the School of Computer Science, +Carnegie Mellon University. +To be presented at Gigabit Networking Workshop - GBN'98. 
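The byte ordering used by header patching (item 3b above) can be sketched in user space. This is only an illustration under stated assumptions: it runs over an ordinary stream socket, where lower-layer headers are already stripped, so h' (called hp here) acts purely as an alignment offset, and no page swapping takes place. The names hp_send and hp_recv are made up for this sketch and do not come from the paper.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Sender side: transmit the application-layer header (h bytes), then the
 * data starting at offset hp + h (d1), then the first hp + h bytes (d0).
 * "data" holds the l bytes found at file offset o.
 * Returns 0 on success, -1 on error or short write.
 */
static int hp_send(int fd, const void *hdr, size_t h,
                   const char *data, size_t l, size_t hp)
{
    struct iovec iov[3];

    assert(l >= hp + h);
    iov[0].iov_base = (void *)hdr;              /* application header   */
    iov[0].iov_len  = h;
    iov[1].iov_base = (void *)(data + hp + h);  /* d1: offset o + h' + h */
    iov[1].iov_len  = l - hp - h;
    iov[2].iov_base = (void *)data;             /* d0: offset o          */
    iov[2].iov_len  = hp + h;

    ssize_t n = writev(fd, iov, 3);             /* gather list, one call */
    return (n == (ssize_t)(h + l)) ? 0 : -1;
}

/*
 * Receiver side: peek at the header, input l - h' bytes at a + h'
 * (header followed by d1), then patch d0 over the first h' + h bytes at a.
 * A short peek is treated as an error here; a real receiver would retry.
 */
static int hp_recv(int fd, void *hdr, size_t h,
                   char *a, size_t l, size_t hp)
{
    if (recv(fd, hdr, h, MSG_PEEK) != (ssize_t)h)
        return -1;
    if (recv(fd, a + hp, l - hp, MSG_WAITALL) != (ssize_t)(l - hp))
        return -1;
    if (recv(fd, a, hp + h, MSG_WAITALL) != (ssize_t)(hp + h))
        return -1;
    return 0;
}
```

After hp_recv returns, the buffer at a holds the l data bytes in file order, which is the property the figure illustrates.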
Index: ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/brustoloni-abs.txt.L +++ - 2024-05-20 02:18:02.350046060 +0200 @@ -0,0 +1 @@ +http://www.ccrc.wustl.edu/pub/ieee-tcgn/conference/gbn98/brustoloni-abs.html Index: ossp-pkg/sio/BRAINSTORM/c10k.html RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/c10k.html,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/c10k.html,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/c10k.html' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/c10k.html +++ - 2024-05-20 02:18:02.352539169 +0200 @@ -0,0 +1,281 @@ + + +The C10K problem + + +

The C10K problem

+It's time for web servers to handle ten thousand clients simultaneously, +don't you think? After all, the web is a big place now. +

+And computers are big, too. You can buy a 500MHz machine +with 1 gigabyte of RAM and six 100Mbit/sec Ethernet cards for $3000 or so. +Let's see - at 10000 clients, that's +50KHz, 100Kbytes, and 60Kbits/sec per client. +It shouldn't take any more horsepower than that to take four kilobytes +from the disk and send them to the network once a second for each +of ten thousand clients. +(That works out to $0.30 per client, by the way. Those +$100/client licensing fees some operating systems charge are starting to +look a little heavy!) So hardware is no longer the bottleneck. +

+One of the busiest ftp sites, ftp.cdrom.com, serves (as of May 1999) +around 5000 clients simultaneously through a 70 megabit/second pipe. +Pipes this fast aren't common yet, but technology is improving rapidly. +

+With that in mind, here are a few notes on how to configure operating +systems and write code to support thousands of clients. The discussion +centers around Unix-like operating systems, for obvious reasons. + +

I/O Strategies

+There seem to be four ways of writing a fast web server to handle many +clients: +
    +
  1. serve many clients with each server process or thread, and use +select() or poll() to avoid blocking. This is the traditional favorite, +and is sometimes referred to as "using nonblocking I/O". +
  2. serve many clients with each server process or thread, and use +asynchronous I/O to avoid blocking. This has not yet become popular, +possibly because of poorly designed asynchronous I/O interfaces. +Zach Brown (author of HoserFTPD) thinks this might now be the way to go +for highest performance; see +his 14 April 1999 post to hftpd-users. +
    +There are several flavors of asynchronous I/O: +
      +
    • the aio_ interface +(scroll down from that link to "Asynchronous input and output"), +which associates a signal and value with each I/O operation. +Signals and their values are queued and delivered efficiently to the user process. +This is from the POSIX 1003.1b realtime extensions, and is also in the Single Unix Specification, +version 2, and in glibc 2.1. +
    • +F_SETSIG +(see also here), +which associates a signal with each file descriptor. +When a normal I/O function like read() or write() completes, the signal +is raised, with the file descriptor as an argument. Similar to aio_ but +without the new calls, and slightly less flexible (you know the handle, +but you don't know whether it is ready for read() or for write() without +doing a poll() on it). +(Currently only in Linux, I think.) +
    • SIGIO (see glibc doc or +BSD Sockets doc) +-- doesn't tell you which handle needs servicing, so it seems kind of coarse. +Used by the Linux F_SETSIG/aio_ implementation as a fallback when the realtime signal +queue overflows. +Here's +an example of its use. (Was partly broken in Linux kernels 2.2.0 - 2.2.7, fixed in 2.2.8.) +
    +
  3. serve one client with each server thread, and let read() and write() +block. (This is the only model supported by Java.) +
  4. Build the server code into the kernel. +Novell and Microsoft are both said to have done this at various times, and at least one NFS implementation does this. +IBM and Sun are said to have released specweb benchmark results using this technique. +
+
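Strategy 1 above can be sketched roughly as follows. This is a minimal illustration, not code from any server mentioned here: the names set_nonblocking and serve_ready and the cap MAX_CLIENTS are assumptions, and a real server would parse requests and keep per-client state instead of just counting bytes.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

#define MAX_CLIENTS 64   /* arbitrary cap chosen for this sketch */

/* Put a descriptor into nonblocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Wait up to timeout_ms for any of the nfds descriptors to become
 * readable, then do one read() on each ready descriptor.
 * Returns the total number of bytes read, or -1 if poll() fails.
 */
static long serve_ready(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfd[MAX_CLIENTS];
    char buf[4096];
    long total = 0;

    assert(nfds > 0 && nfds <= MAX_CLIENTS);
    for (int i = 0; i < nfds; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }

    int ready = poll(pfd, (nfds_t)nfds, timeout_ms);
    if (ready < 0)
        return -1;

    for (int i = 0; i < nfds && ready > 0; i++) {
        if (pfd[i].revents == 0)
            continue;
        ready--;
        if (pfd[i].revents & POLLIN) {
            /* Nonblocking read: never stalls the other clients. */
            ssize_t n = read(pfd[i].fd, buf, sizeof buf);
            if (n > 0)
                total += n;
        }
    }
    return total;
}
```

One process calls serve_ready in a loop over all its clients, so a slow client never holds up the rest.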

+Richard Gooch has written a paper discussing these options. Interesting reading. +

+The Apache mailing lists have some interesting posts +(one, +two, +three) +about why they prefer not to use select() (basically, they think that makes plugins harder). +
+I have not yet seen any data comparing the performance of the four approaches. +

+Mark Russinovich wrote +an editorial and +an article +discussing I/O strategy issues in the 2.2 Linux kernel. Worth reading, even +if he seems misinformed on some points. In particular, he +seems to think that Linux 2.2's asynchronous I/O +(see F_SETSIG above) doesn't notify the user process when data is ready, only +when new connections arrive. This seems like a bizarre misunderstanding. +See also +comments on an earlier draft, +a rebuttal from Mingo, +Russinovich's comments of 2 May 1999, +a rebuttal from Alan Cox, +and various +posts to linux-kernel. + +

Limits on open filehandles

+ + +

Limits on threads

+
    +
  • Solaris: it supports as many threads as will fit in memory, I hear. +
  • FreeBSD: ? +
  • Linux: Even the 2.2.2 kernel limits the number of threads, +at least on Intel. I don't know what the limits are on other architectures. +Mingo posted a patch +for 2.1.131 on Intel that removed this limit; I hear he intends to +provide updated patches as time goes on, until it's time to integrate it +into the main version of the kernel. +
  • Java: See the +Volanomark benchmark writeup in Javaworld. It recommends reducing +the amount of memory reserved by default for each thread. +
+ +

Other limits/tips

+
    +
  • select() is limited to FD_SETSIZE handles. This limit is compiled into +the standard library and user programs. The similar call poll() does not have +a comparable limit, and can have less overhead than select(). +
  • Even the most recent glibc might use 16-bit variables to hold thread +or file handles, which could cause trouble above 32767 handles/threads. +
  • Too much thread-local memory is preallocated by some operating systems; +if each thread gets 1MB, and total VM space is 2GB, that creates an upper limit +of 2000 threads. +
  • Normally, data gets copied many times on its way from here to there. +mmap() and sendfile() can be used to reduce this overhead in some cases. +IO-Lite +is a proposal (already implemented on FreeBSD) for a set of +I/O primitives that gets rid of the need for many copies. +It's sexy; go read it. But see also Alan Cox's opinion of zero-copy. +
  • The sendfile() function in Linux and FreeBSD lets you tell the kernel to send part +or all of a file. This lets the OS do it as efficiently as possible. +It can be used equally well in servers using threads or servers using +nonblocking I/O. (In Linux, it's poorly documented at the moment; use _syscall4 to +call it. Andi Kleen is writing new man pages that cover this.) +Rumor has it, +ftp.cdrom.com benefited noticeably from sendfile(). +
  • A new socket option under Linux, TCP_CORK, tells the kernel to +avoid sending partial frames, which helps a bit e.g. when there are +lots of little write() calls you can't bundle together for some reason. +Unsetting the option flushes the buffer. +
  • Not all threads are created equal. The clone() function in Linux +(and its friends in other operating systems) +lets you create a thread that has its own current working directory, +for instance, which can be very helpful when implementing an ftp server. +See Hoser FTPd for an example of the use of native threads rather than pthreads. +
  • To keep the number of filehandles per process down, servers +can fork() once they reach the desired maximum; the child +finishes serving the existing clients, and the parent accepts and +services new clients. (If the desired maximum is 1, this degenerates +to the classical one-process-per-client model.) +
  • One developer using sendfile() with FreeBSD reports that using +POLLWRBAND instead of POLLOUT makes a big difference. +
  • Look at the performance comparison graph at the bottom of +http://www.acme.com/software/thttpd/benchmarks.html. +Notice how various servers have trouble above 128 connections, even on Solaris 2.6? +Anyone who figures out why, let me know. +
    +Note: if the TCP stack has a bug that causes a short (200ms) +delay at SYN or FIN time, as Linux 2.2.0-2.2.6 had, and the OS or +http daemon has a hard limit on the number of connections open, +you would expect exactly this behavior. There may be other causes. +
  • "Re: fix for hybrid server problems" by Vivek Sadananda Pai +(vivek@cs.rice.edu) on +new-httpd, May 9th, notes: +
    +"I've compared the raw performance of a select-based server with a +multiple-process server on both FreeBSD and Solaris/x86. On +microbenchmarks, there's only a marginal difference in performance +stemming from the software architecture. The big performance win for +select-based servers stems from doing application-level caching. While +multiple-process servers can do it at a higher cost, it's harder to +get the same benefits on real workloads (vs microbenchmarks). +I'll be presenting those measurements as part of a paper that'll +appear at the next Usenix conference. If you've got postscript, +the paper is available at +http://www.cs.rice.edu/~vivek/flash99/" +
    +
+ +
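The TCP_CORK tip above might look like this in practice. This is a Linux-only sketch under stated assumptions: set_cork and send_corked are illustrative names, and a short write() is simply treated as failure where a real server would retry.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set (on = 1) or clear (on = 0) TCP_CORK; returns 0 on success. */
static int set_cork(int sock, int on)
{
    assert(sock >= 0);
    return setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof on);
}

/*
 * Send a header and a body with two separate write() calls. While the
 * cork is set the kernel holds back partial frames, so the two writes
 * can share packets; clearing the cork flushes whatever is left.
 * Returns 0 on success, -1 on any error or short write.
 */
static int send_corked(int sock, const void *hdr, size_t hlen,
                       const void *body, size_t blen)
{
    if (set_cork(sock, 1) != 0)
        return -1;

    int ok = write(sock, hdr, hlen) == (ssize_t)hlen
          && write(sock, body, blen) == (ssize_t)blen;

    /* Always remove the cork, even after a failed write. */
    if (set_cork(sock, 0) != 0)
        return -1;
    return ok ? 0 : -1;
}
```

The same pattern applies when the body is sent with sendfile() instead of the second write().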

Kernel Issues

+For Linux, it looks like kernel bottlenecks are being fixed constantly. +See Linux HQ, +Kernel Traffic, +and the Linux-Kernel mailing list +(Example interesting posts by +a user asking how to tune, and +Dean Gaudet) +

+In March 1999, Microsoft sponsored a benchmark comparing NT to Linux +at serving large numbers of http and smb clients, in which they +failed to see good results from Linux. +See also my article on Mindcraft's April 1999 Benchmarks +for more info. +

+See also The Linux Scalability Project. + +

Measuring Server Performance

+Two tests in particular are simple, interesting, and hard: +
    +
  1. raw connections per second (how many 512 byte files per second can you +serve?) +
  2. total transfer rate on large files with many slow clients +(how many 28.8k modem clients can simultaneously download +from your server before performance goes to pot?) +
+Jef Poskanzer has published benchmarks comparing many web servers. +See http://www.acme.com/software/thttpd/benchmarks.html +for his results. +

+I also have + +a few old notes about comparing thttpd to Apache that may be of interest +to beginners. + +

Interesting select()-based servers

+
    +
  • thttpd +Very simple. Uses a single process. It has good performance, +but doesn't scale with the number of CPUs. +
  • mathopd. Similar to thttpd. +
  • Zeus, a commercial server that tries to be the absolute fastest. +See their tuning guide. +
  • The other non-Java servers listed at http://www.acme.com/software/thttpd/benchmarks.html +
  • BetaFTPd +
  • Flash-Lite - +web server using IO-Lite. +
  • xitami - uses select() to +implement its own thread abstraction for portability to systems without +threads. +
  • Medusa - a server-writing toolkit in Python that tries to deliver very high performance. +
+ +

Interesting thread-based servers

+ + +

Other interesting servers

+ + +
+Copyright 1999 Dan Kegel
+dank@alumni.caltech.edu
+Last updated: 9 May 1999
+[Return to www.kegel.com] + + + Index: ossp-pkg/sio/BRAINSTORM/c10k.html.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/c10k.html.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/c10k.html.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/c10k.html.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/c10k.html.L +++ - 2024-05-20 02:18:02.355239924 +0200 @@ -0,0 +1 @@ +http://www.kegel.com/c10k.html Index: ossp-pkg/sio/BRAINSTORM/c10k.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/c10k.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/c10k.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/c10k.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/c10k.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/copyavoid.pdf RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/copyavoid.pdf,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/copyavoid.pdf,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/copyavoid.pdf' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/copyavoid.pdf and - differ Index: ossp-pkg/sio/BRAINSTORM/copyavoid.pdf.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/copyavoid.pdf.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/copyavoid.pdf.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/copyavoid.pdf.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/copyavoid.pdf.L +++ - 2024-05-20 02:18:02.362789887 +0200 @@ -0,0 +1 @@ +ftp://ftp.cup.hp.com/dist/networking/briefs/copyavoid.pdf Index: ossp-pkg/sio/BRAINSTORM/copyavoid.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/copyavoid.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/copyavoid.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/copyavoid.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/copyavoid.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/dean.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/dean.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/dean.txt,v' | diff -u /dev/null - 
-L'ossp-pkg/sio/BRAINSTORM/dean.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/dean.txt +++ - 2024-05-20 02:18:02.367778779 +0200 @@ -0,0 +1,1191 @@ +From dgaudet-list-new-httpd@arctic.org Mon Mar 13 11:11:15 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-660-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: Buff should be an I/O layer +Date: 12 Mar 2000 19:24:28 +0100 +Organization: Mail2News at engelschall.com +Lines: 41 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 952885468 17709 141.1.129.1 (12 Mar 2000 18:24:28 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 12 Mar 2000 18:24:28 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:36319 + +On Fri, 10 Mar 2000, Manoj Kasichainula wrote: + +> Random thought that I thought should go to the list before I forget +> it: +> +> BUFF's API should be redone to look exactly like an IOL. As far as the +> rest of Apache code is concerned, a IOL with a buff around it +> should look just like any other IOL. + +haha! + +if you figure this out, then rad. i tried for a long time to figure out a +clean way to do this which doesn't suck and i never found one. + +remember it is totally unacceptable for bputc() and bgetc() to be anything +other than macros operating directly on the buffer. + +> First of all, we get more uniformity in the API. That's always a good +> thing. This also allows us to yank out the buff IOL sometimes. I can +> see this being useful if a really sophisticated module wants to truly +> eliminate the buffering between it and the client. + +the BUFF layer allows you to run without buffering, or with non-blocking +i/o, and it implements chunking. you gain a mere few cycles by "yanking +it out". 
+ +you might say "chunking should be a layer too", see my above laughter. + +the only way you're going to make a change like this "clean" is to add a +sophisticated zero-copy implementation. i stopped short of doing this +because in my experience using one of these, the benefits are really +minimal. look in libstash. + +the only case where i've seen the zero-copy stuff really shine is when +doing TCP-to-TCP proxying. for everything else, the (comparatively +simple) heuristics present in BUFF are all that's required. + +Dean + + + +From dgaudet-list-new-httpd@arctic.org Mon Mar 13 11:11:44 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-661-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: Buff should be an I/O layer +Date: 12 Mar 2000 19:24:30 +0100 +Organization: Mail2News at engelschall.com +Lines: 41 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 952885470 17739 141.1.129.1 (12 Mar 2000 18:24:30 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 12 Mar 2000 18:24:30 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:36320 + + + +On Sun, 12 Mar 2000, Dean Gaudet wrote: + +> the only way you're going to make a change like this "clean" is to add a + +i should clarify this... the only *portable* way is blah blah blah. + +if all unixes supported TCP_CORK, and had very inexpensive syscall +overhead like linux does then we wouldn't have to do much work at all -- +we could take advantage of the fact that the kernel generally has to do a +single copy of all the bytes anyhow. + +TCP_CORK, for those not aware of it, is a very much needed correction to +the TCP API. 
specifically, the traditional API gives us the two options: + +- nagle on: the kernel makes somewhat arbitrary decisions as to where to +form your TCP packet boundaries, you might get lucky and two small writes +will be combined into one packet... or you might get unlucky and your +packets will be delayed causing performance degradation. + +- nagle off: each write() can, and usually does, cause a packet boundary. + +it's pretty much the case that no matter which option you choose it +results in performance degradation. + +with TCP_CORK the kernel is permitted to send any complete frames, but +can't send any final partial frames until the cork is removed. this lets +user applications use write(), which is *far* more natural to use than +writev() ... + +writev() is essentially an optimisation for kernels with expensive syscall +overhead :) + +i think TCP_CORK is still unique to linux. they added it when they were +implementing sendfile() and i pointed out the packet boundary problems and +asked for this new api... most other sendfile() implementations are +kludges bundling up sendfile and writev for the headers and trailers. 
+ +Dean + +From dgaudet-list-new-httpd@arctic.org Tue Apr 11 08:16:13 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-2246-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (dean gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: SAFEREAD (was Re: Buff should be an I/O layer) +Date: 11 Apr 2000 07:12:07 +0200 +Organization: Mail2News at engelschall.com +Lines: 31 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 955429927 94666 141.1.129.1 (11 Apr 2000 05:12:07 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 11 Apr 2000 05:12:07 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37614 + +On Mon, 10 Apr 2000, dean gaudet wrote: + +> if you do that against apache-1.3 and the current apache-2.0 you get +> back maximally packed packets. + +heh, no 2.0 is broken. i broke SAFEREAD during the initial mpm work -- +and it hasn't been re-implemented yet. + +does someone else want to fix this? it's probably not ideal that i'm the +only person intimately familiar with this code :) + +without SAFEREAD, we end up with a packet boundary between every response +in a pipelined connection. + +essentially saferead ensures that if we are going to have to block in +read() to get the next request on a connection then we better flush our +output buffer (otherwise we cause a deadlock with non-pipelining clients). +but if there are more bytes available, then we don't need to flush our +output buffer. + +if you search the code for SAFEREAD you'll see i suggest that it might be +implemented as a layer. i'm not sure what i meant, i don't think this +works. + +if you look at 1.3's saferead and bhalfduplex in buff.c you'll see that we +use select() to implement it. naturally this won't work in 2.0... but +what will work is to set the iol's timeout to 0 and attempt a read. 
+underneath the covers this achieves the same result. + +-dean + +From dgaudet-list-new-httpd@arctic.org Thu Apr 13 14:10:41 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-2263-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (dean gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: question about the STATUS entry for lingering close +Date: 11 Apr 2000 18:35:53 +0200 +Organization: Mail2News at engelschall.com +Lines: 74 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 955470953 15153 141.1.129.1 (11 Apr 2000 16:35:53 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 11 Apr 2000 16:35:53 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37624 + +On Tue, 11 Apr 2000, Jeff Trawick wrote: + +> > * Fix lingering close +> > Status: +> +> Does 2.0 regress one or more of the solutions in 1.3, or was some +> improvement (other than async I/O) envisioned? + +this actually isn't about implementing SO_LINGER stuff... that should be +avoided at all costs -- SO_LINGER works on very few kernels. actually +i'm tempted to say rip out all the SO_LINGER stuff. + +we need lingering_close() re-implemented, it's in main/http_connection.c. + +to do that we need to add a shutdown() method to ap_iol_methods, and i +suggest an ap_bshutdown() added to BUFF. + +and then lingering_close() needs to be re-implemented something like +the code below. + +-dean + +/* we now proceed to read from the client until we get EOF, or until + * MAX_SECS_TO_LINGER has passed. the reasons for doing this are + * documented in a draft: + * + * http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt + * + * in a nutshell -- if we don't make this effort we risk causing + * TCP RST packets to be sent which can tear down a connection before + * all the response data has been sent to the client. 
+ */ + +static void lingering_close(request_rec *r) +{ + char dummybuf[IOBUFFERSIZE]; + ap_time_t start; + ap_ssize_t nbytes; + ap_status_t rc; + int timeout; + + /* Send any leftover data to the client, but never try to again */ + + if (ap_bflush(r->connection->client) != APR_SUCCESS) { + ap_bclose(r->connection->client); + return; + } + /* XXX: hrm, setting B_EOUT should probably be part of ap_bshutdown() */ + ap_bsetflag(r->connection->client, B_EOUT, 1); + + if (ap_bshutdown(r->connection->client, 1) != APR_SUCCESS + || ap_is_aborted(r->connection)) { + ap_bclose(r->connection->client); + return; + } + + start = ap_now(); + timeout = MAX_SECS_TO_LINGER; + for (;;) { + ap_bsetopt(r->connection->client, BO_TIMEOUT, &timeout); + rc = ap_bread(r->connection->client, dummybuf, sizeof(dummybuf), &nbytes); + if (rc != APR_SUCCESS) break; + + /* how much time has elapsed? */ + timeout = (ap_now() - start) / AP_USEC_PER_SEC; + if (timeout >= MAX_SECS_TO_LINGER) break; + + /* figure out the new timeout */ + timeout = MAX_SECS_TO_LINGER - timeout; + } + + ap_bclose(r->connection->client); +} + +From dgaudet-list-new-httpd@arctic.org Tue Mar 28 11:34:53 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1510-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: canonical list of i/o layering use cases +Date: 28 Mar 2000 07:04:07 +0200 +Organization: Mail2News at engelschall.com +Lines: 93 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954219847 46303 141.1.129.1 (28 Mar 2000 05:04:07 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 28 Mar 2000 05:04:07 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37027 + +i really hope this helps this discussion move forward. 
+ +the following is the list of all applications i know of which have been +proposed to benefit from i/o layering. + +- data sink abstractions: + - memory destination (for ipc; for caching; or even for abstracting + things such as strings, which can be treated as an i/o + object) + - pipe/socket destination + - portability variations on the above + +- data source abstraction, such as: + - file source (includes proxy caching) + - memory source (includes most dynamic content generation) + - network source (TCP-to-TCP proxying) + - database source (which is probably, under the covers, something like + a memory source mapped from the db process on the same box, + or from a network source on another box) + - portability variations in the above sources + +- filters: + - encryption + - translation (ebcdic, unicode) + - compression + - chunking + - MUX + - mod_include et al + +and here are some of my thoughts on trying to further quantify filters: + +a filter separates two layers and is both a sink and a source. a +filter takes an input stream of bytes OOOO... and generates an +output stream of bytes which can be broken into blocks such +as: + + OOO NNN O NNNNN ... + + where O = an old or original byte copied from the input + and N = a new byte generated by the filter + +for each filter we can calculate a quantity i'll call the copied-content +ratio, or CCR: + + nbytes_old / nbytes_new + +where: + nbytes_old = number of bytes in the output of the + filter which are copied from the input + (in zero-copy this would mean "copy by + reference counting an input buffer") + nbytes_new = number of bytes which are generated + by the filter which weren't present in the + input + +examples: + +CCR = infinity: who cares -- straight through with no + transformation. the filter shouldn't even be there. + +CCR = 0: encryption, translation (ebcdic, unicode), compression. + these get zero benefit from zero-copy. 
+ +CCR > 0: chunking, MUX, mod_include + +from the point of view of evaluating the benefit of zero-copy we only +care about filters with CCR > 0 -- because CCR = 0 cases degenerate into +a single-copy scheme anyhow. + +it is worth noting that the large_write heuristic in BUFF fairly +clearly handles zero-copy at very little overhead for CCRs larger than +DEFAULT_BUFSIZE. + +what needs further quantification is what the CCR of mod_include would +be. + +for a particular zero-copy implementation we can find some threshold k +where filters with CCRs >= k are faster with the zero-copy implementation +and CCRs < k are slower... faster/slower as compared to a baseline +implementation such as the existing BUFF. + +it's my opinion that when you consider the data sources listed above, and +the filters listed above that *in general* the existing BUFF heuristics +are faster than a complete zero-copy implementation. + +you might ask how does this jive with published research such as the +IO-Lite stuff? well, when it comes right down to it, the research in +the IO-Lite papers deal with very large CCRs and contrast them against +a naive buffering implementation such as stdio -- they don't consider +what a few heuristics such as apache's BUFF can do. 
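As a worked instance of the CCR definition above, here is a small sketch that computes the ratio for a chunking filter. The helper names ccr and chunk_overhead are assumptions made for illustration; the overhead count assumes HTTP/1.1 chunk framing (hex length, CRLF, data, CRLF) with no chunk extensions.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/*
 * Copied-content ratio: bytes copied unchanged from the input divided by
 * bytes newly generated by the filter. A filter that generates nothing
 * is reported as infinity (pure pass-through).
 */
static double ccr(size_t nbytes_old, size_t nbytes_new)
{
    if (nbytes_new == 0)
        return INFINITY;
    return (double)nbytes_old / (double)nbytes_new;
}

/*
 * Framing bytes added around one chunk of n data bytes:
 * the hex digits of n, CRLF, then a trailing CRLF after the data.
 */
static size_t chunk_overhead(size_t n)
{
    size_t digits = 1;

    assert(n > 0);           /* a zero-length chunk is the terminator */
    while (n >= 16) {
        n /= 16;
        digits++;
    }
    return digits + 2 + 2;
}
```

For a 4096-byte chunk the framing is "1000", CRLF and a trailing CRLF, 8 bytes in all, so the CCR is 4096 / 8 = 512. An encryption or compression filter copies nothing through, giving ccr(0, n) = 0.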
+ +Dean + +From dgaudet-list-new-httpd@arctic.org Tue Mar 28 11:35:26 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1524-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: canonical list of i/o layering use cases +Date: 28 Mar 2000 07:04:14 +0200 +Organization: Mail2News at engelschall.com +Lines: 15 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954219854 46403 141.1.129.1 (28 Mar 2000 05:04:14 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 28 Mar 2000 05:04:14 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37035 + +On Mon, 27 Mar 2000, Dean Gaudet wrote: + +> CCR = infinity: who cares -- straight through with no +> transformation. the filter shouldn't even be there. + +thanks to ronald for pointing out CCR = infinity filters -- hash +calculations. + +thankfully they're trivial to handle without a full zero-copy +implementation, so i still stand by my assertions. + +please to submit more filters/sources/sinks i haven't considered yet. + +Dean + +From dgaudet-list-new-httpd@arctic.org Tue Mar 28 11:38:30 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1494-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: layered I/O (was: cvs commit: ...) 
+Date: 28 Mar 2000 07:02:15 +0200 +Organization: Mail2News at engelschall.com +Lines: 47 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954219735 45661 141.1.129.1 (28 Mar 2000 05:02:15 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 28 Mar 2000 05:02:15 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37016 + +On Mon, 27 Mar 2000, Roy T. Fielding wrote: + +> Whatever we do, it needs to be clean enough to enable later performance +> enhancements along the lines of zero-copy streams. + +good luck, i think it's impossible. any code written to a read/write +interface is passing around buffers with no reference counts, and the +recipients of those buffers (the i/o layers) must immediately copy the +contents before returning to the caller. zero-copy requires reference +counted buffers. therefore any future zero-copy enhancement would require +substantial code changes to the callers -- the modules. + +btw -- the code, as is, already supports zero-copy for those cases where +it's actually a win... the cases where bwrite() is called with a large +enough buffer, and we're able to pass it to the kernel immediately. + +i honestly believe there are very few applications which benefit from +zero-copy. encryption and compression obviously don't, they require a copy. +what other layers would there be in the stack? + +a mod_include-type filter which was doing zero-copy would probably be +slower than it is now... that'd be using zero-copy to pass little bits and +pieces of strings, the bits and pieces which were unchanged by the filter. +zero-copy has all this overhead in maintaining the lists of buffers and +the reference counts... more overhead in that than in the rather simple +heuristics present in BUFF. + +a MUX layer might benefit from zero-copy ... 
but after doing lots of +thinking on this a year ago i remain completely unconvinced that zero-copy +from end to end is any better than the heuristics we already have... and +the answer is different depending on how parallel mux requests are +serviced (whether by threads or by an async core). + +there is one application i know of that benefits from zero-copy -- and +that is TCP to TCP tunnelling. but even here, the biggest win i've seen +is not from the zero-copy per se, as much as it is from wins you can get +in reducing the need to have a bunch of 4k userland buffers for each +socket. + +zero-copy is a very nice theoretical toy. i'm still waiting for a good +demonstration or use case. + +i hope you don't hold up improvements in apache while this research is +going on. + +Dean + +From dgaudet-list-new-httpd@arctic.org Tue Mar 28 11:40:37 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1514-rse+apache=en.muc.de +From: dgaudet-list-new-httpd@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: layered I/O (was: cvs commit: ...) +Date: 28 Mar 2000 07:04:08 +0200 +Organization: Mail2News at engelschall.com +Lines: 22 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954219848 46333 141.1.129.1 (28 Mar 2000 05:04:08 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 28 Mar 2000 05:04:08 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37028 + +On Sun, 26 Mar 2000, Greg Stein wrote: + +> Below, you talk about doing this without performance implications. Well, +> that loop is one that you've added. :-) + +it's very easy to optimize the loop further -- by hashing the strings +which run the direct matches. + +it's really helpful to consider a simple example: + +accessing foo.cgi, which generates "Content-Type: text/x-parsed-html" +which requires mod_include to run. 
+ +to run foo.cgi r->handler is set to "cgi-handler", which, assuming we do +the hash right, is picked off immediately and run without looping. + +then r->content_type is updated, and set to "text/x-parsed-html", and +again, if we've done the hash right, then it's picked off immediately +without looping. + +Dean + +From bhyde@pobox.com Wed Mar 29 08:28:32 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1539-rse+apache=en.muc.de +From: bhyde@pobox.com (Ben Hyde) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: canonical list of i/o layering use cases +Date: 28 Mar 2000 19:26:56 +0200 +Organization: Mail2News at engelschall.com +Lines: 44 +Approved: postmaster@m2ndom +Message-ID: <87ln335fbt.fsf@pobox.com> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954264417 68771 141.1.129.1 (28 Mar 2000 17:26:57 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 28 Mar 2000 17:26:57 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37045 + +Dean Gaudet writes: +> please to submit more filters/sources/sinks i haven't considered yet. + +:-) + +This discussion flares up from time to time, doesn't it! + +The last time this flared up I came to the opinion that +there is a knot here worth illuminating, but I can't recall +if I bothered to say it out loud here. + +The inside of the knot is: planning, authorizing, executing. + +The request arrives and the beast has to assemble a plan for +how to respond. These plans can, and ought to be, reasonably +ornate; at least a tree. The primitive nodes are things like +stream this file, do this character set conversion, etc. The +slightly more complex nodes do things like store results in +caches, and assemble bits into bundles, etc. The core ought +to provide a way to manipulate these plans. The types of nodes +and the operation sets on them should be provided by modules. 
+ +Given the plan, the problem is then to decide whether it is appropriate +to execute it. I.e. do we have all the rights we need? This is +a mess because it needs to draw information from three domains: +the client's credentials, the process/thread rights, and the +protection configuration on the named objects that are inputs to +the plan. + +Finally execution. The execution is where we start wanting +terrific efficiencies. Zero copy, clever caching, kernel hackery, +leveraging O/S specific delights like sendfile. + +The outside of the knot of plan/auth/exec is the necessity of +letting all three phases spread across machines, processes, threads, +code cult, and projects. (This is why that www.spread.org work is +so right on). + +I suspect it was at about this point in my thinking that I became +happy that this was slipping outside the scope of 2.0 where it +could stew longer. + + - ben + +From fielding@kiwi.ICS.UCI.EDU Wed Mar 29 14:05:46 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1593-rse+apache=en.muc.de +From: fielding@kiwi.ICS.UCI.EDU ("Roy T. Fielding") +Newsgroups: en.lists.apache-new-httpd +Subject: Re: layered I/O (was: cvs commit: ...) +Date: 29 Mar 2000 07:58:05 +0200 +Organization: Mail2News at engelschall.com +Lines: 65 +Approved: postmaster@m2ndom +Message-ID: <200003282119.aa05413@gremlin-relay.ics.uci.edu> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954309485 4205 141.1.129.1 (29 Mar 2000 05:58:05 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 29 Mar 2000 05:58:05 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37088 + +>> Bite me. You're being totally confrontational here, and it isn't called +>> for. You didn't read my statements. I said no change to generators, and +>> definite changes to processors. +> +>no it is called for. 
+> +>this item has been on the bullet list for as long as apache-2.0 has been a +>wet dream. +> +>there's been a fuckload of hand waving over the years, and *no code*. +>code speaks reams more in my book than anything else. +> +>so far all i'm seeing is more hand waving, and cries that this isn't the +>wet dream folks thought it would be. +> +>welcome to reality. if you folks would stop waving your hands and +>actually try to code it up you'd probably understand our proposed +>solution. + +Dean, this is crossing my tolerance threshold for bullshit. You haven't +even looked at the code that was committed. If you had, you would notice +that it doesn't implement IO-layering. What it implements is IO-relaying +and an infinite handler loop. This isn't handwaving. The code simply +doesn't do IO-layering. Period. + +Layered-IO involves a cascaded sequence of filters that independently +operate on a continuous stream in an incremental fashion. Relayed-IO +is a sequence of processing entities that opportunistically operate +on a unit of data to transform it to some other unit of data, which +can then be made available again to the other processing entities. +The former is called a pipe-and-filter architecture, and the latter +is called a blackboard architecture, and the major distinctions between +the two are: + + 1) in layered-IO, the handlers are identified by construction + of a data flow network, whereas in relayed-IO the handlers + simply exist in a "bag of handlers" and each one is triggered + based on the current data state; + + 2) in layered-IO, the expectation is that the data is processed + as a continuous stream moving through handlers, whereas in + relayed-IO the data is operated upon in complete units and + control is implicitly passed from one processor to the next; + + 3) in layered-IO, data processing ends at the outer layer, + whereas in relayed-IO it ends when the data reaches a special + state of "no processing left to be done". 
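[ed: the three distinctions above can be sketched in C. This is a minimal illustration under stated assumptions, not the code that was committed to Apache: every name below (layer, handler, run_layered, run_relayed, and the sample transforms) is hypothetical. The layered runner walks a chain that was fixed at construction time and stops at the outer layer; the relayed runner repeatedly picks whichever handler matches the current data state, and needs an explicit pass limit because nothing else prevents it from looping.]

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Layered IO: handlers are wired into a fixed data-flow chain. */
typedef struct layer {
    void (*process)(char *data);   /* transforms the data in place */
    struct layer *next;            /* decided when the chain is built */
} layer;

void run_layered(const layer *first, char *data)
{
    assert(data != NULL);
    for (const layer *l = first; l != NULL; l = l->next)
        l->process(data);          /* processing ends at the outer layer */
}

/* Relayed IO: a "bag of handlers", each triggered by the data state. */
typedef struct handler {
    const char *trigger;                  /* state this handler reacts to */
    const char *(*process)(char *data);   /* returns the new state */
} handler;

/* Returns the number of passes made, or -1 if max_passes was reached
 * while a handler still matched (a possible infinite loop). */
int run_relayed(const handler *bag, size_t n, const char *state,
                char *data, int max_passes)
{
    int passes = 0;
    assert(state != NULL && data != NULL);
    for (;;) {
        size_t i = 0;
        while (i < n && strcmp(bag[i].trigger, state) != 0)
            i++;
        if (i == n)
            return passes;         /* "no processing left to be done" */
        if (passes == max_passes)
            return -1;
        state = bag[i].process(data);
        passes++;
    }
}

/* Sample transforms, for illustration only. */
void to_upper(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

void dash_to_space(char *s)
{
    for (; *s != '\0'; s++)
        if (*s == '-')
            *s = ' ';
}

const char *run_cgi(char *s)     { to_upper(s);      return "text/x-parsed-html"; }
const char *run_include(char *s) { dash_to_space(s); return "text/html"; }
const char *run_forever(char *s) { (void)s;          return "stuck"; }
```

[ed: with a bag holding run_cgi under "cgi-handler" and run_include under "text/x-parsed-html", run_relayed makes two passes and stops once the state is "text/html"; a handler that returns its own trigger, like run_forever, only stops because of max_passes.]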
+ +Yes, these two architectures are similar and can accomplish the +same tasks, but they don't have the same performance characteristics +and they don't have the same configuration interface. And, perhaps +most significantly, relayed-IO systems are not as reliable because +it is very hard to anticipate how processing will occur and very easy +for the system to become stuck in an infinite loop. + +I don't want a blackboard architecture in Apache, regardless of the +version of the release or how many users might be satisfied by the +features it can implement. It is unreliable and hard to maintain +and adds too much latency to the response processing. But if somebody +else really wants such an architecture, and they understand its implications, +then I won't prevent them from going with that solution -- I just +don't want them thinking it is what we meant by layered-IO. + +....Roy + +From fielding@kiwi.ICS.UCI.EDU Wed Mar 29 14:06:34 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1594-rse+apache=en.muc.de +From: fielding@kiwi.ICS.UCI.EDU ("Roy T. Fielding") +Newsgroups: en.lists.apache-new-httpd +Subject: Re: layered I/O (was: cvs commit: ...) +Date: 29 Mar 2000 07:58:15 +0200 +Organization: Mail2News at engelschall.com +Lines: 56 +Approved: postmaster@m2ndom +Message-ID: <200003282148.aa06947@gremlin-relay.ics.uci.edu> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954309496 4516 141.1.129.1 (29 Mar 2000 05:58:16 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 29 Mar 2000 05:58:16 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37089 + +>On Mon, 27 Mar 2000, Roy T. Fielding wrote: +> +>> Whatever we do, it needs to be clean enough to enable later performance +>> enhancements along the lines of zero-copy streams. +> +>good luck, i think it's impossible. 
any code written to a read/write +>interface is passing around buffers with no reference counts, and the +>recipients of those buffers (the i/o layers) must immediately copy the +>contents before returning to the caller. zero-copy requires reference +>counted buffers. therefore any future zero-copy enhancement would require +>substantial code changes to the callers -- the modules. + +Yes, assuming we did zero copies. We could still do the interim thing +with one-copy. I agree that taking advantage of the higher performance +would require that the callers be attuned to the high-performance interface. + +>btw -- the code, as is, already supports zero-copy for those cases where +>it's actually a win... the cases where bwrite() is called with a large +>enough buffer, and we're able to pass it to the kernel immediately. +> +>i honestly believe there are very few applications which benefit from +>zero-copy. encryption and compression obviously don't, they require a copy. +>what other layers would there be in the stack? +> +>a mod_include-type filter which was doing zero-copy would probably be +>slower than it is now... that'd be using zero-copy to pass little bits and +>pieces of strings, the bits and pieces which were unchanged by the filter. +>zero-copy has all this overhead in maintaining the lists of buffers and +>the reference counts... more overhead in that than in the rather simple +>heuristics present in BUFF. + +On the contrary, the vast majority of include-based templates +consist of large chunks of HTML with embedded separators. With zero-copy +you can just split the data into three buckets by reference and +replace the middle bucket with the included content, which may +itself be a data stream. Not only does this reduce memory consumption, +it also removes almost all of the special-case handling of data sources +via subrequests/caching/proxy/whatever and vastly simplifies the +SSI/PHP/whatever processing architecture. 
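[ed: the "three buckets by reference" idea can be sketched as follows. This is an illustration only, assuming a hypothetical bucket type that is just a pointer and a length; it is not the bucket brigade API, and it leaves out the reference counting that a real zero-copy scheme would need to decide when the underlying memory can be freed. The function name splice_include is likewise made up.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A bucket refers to bytes it does not own; nothing is copied. */
typedef struct bucket {
    const char *data;
    size_t len;
} bucket;

/* Split `page` around the first occurrence of `marker` and put
 * `included` in place of the marker.  Writes up to three buckets to
 * `out` and returns how many were written: 3 when the marker was
 * found, 1 (the untouched page) when it was not. */
size_t splice_include(bucket page, bucket marker, bucket included,
                      bucket out[3])
{
    assert(out != NULL && marker.len > 0);
    for (size_t i = 0; i + marker.len <= page.len; i++) {
        if (memcmp(page.data + i, marker.data, marker.len) == 0) {
            out[0].data = page.data;                /* text before marker */
            out[0].len  = i;
            out[1]      = included;                 /* replaces the marker */
            out[2].data = page.data + i + marker.len;
            out[2].len  = page.len - i - marker.len; /* text after marker */
            return 3;
        }
    }
    out[0] = page;
    return 1;
}
```

[ed: the first and third buckets point into the original page buffer, so the only new memory involved is the included content itself. A pointer/length pair like this maps directly onto struct iovec, which is why the resulting array could be handed to writev() without being flattened first.]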
+ +>a MUX layer might benefit from zero-copy ... but after doing lots of +>thinking on this a year ago i remain completely unconvinced that zero-copy +>from end to end is any better than the heuristics we already have... and +>the answer is different depending on how parallel mux requests are +>serviced (whether by threads or by an async core). + +The place where I need zero copy is in the request processing, where +the first read off the network may result in multiple requests being +placed within the same buffer. I don't want the initial request to +be copied into separate buffers, since I still consider that initial +copy to be more overhead than all of the reference counting combined. +Maybe I'm just being too pessimistic about the cost of a data copy, +and I should optimize around one-copy instead. + +....Roy + +From jwbaker@acm.org Wed Mar 29 14:07:15 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1598-rse+apache=en.muc.de +From: jwbaker@acm.org ("Jeffrey W. Baker") +Newsgroups: en.lists.apache-new-httpd +Subject: Re: layered I/O (was: cvs commit: ...) +Date: 29 Mar 2000 13:45:26 +0200 +Organization: Mail2News at engelschall.com +Lines: 163 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954330326 23160 141.1.129.1 (29 Mar 2000 11:45:26 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 29 Mar 2000 11:45:26 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37093 + +On Tue, 28 Mar 2000, Roy T. Fielding wrote: +[ed] +> Layered-IO involves a cascaded sequence of filters that independently +> operate on a continuous stream in an incremental fashion. Relayed-IO +> is a sequence of processing entities that opportunistically operate +> on a unit of data to transform it to some other unit of data, which +> can then be made available again to the other processing entities. 
+> The former is called a pipe-and-filter architecture, and the latter +> is called a blackboard architecture, and the major distinctions between +> the two are: +> +> 1) in layered-IO, the handlers are identified by construction +> of a data flow network, whereas in relayed-IO the handlers +> simply exist in a "bag of handlers" and each one is triggered +> based on the current data state; +> +> 2) in layered-IO, the expectation is that the data is processed +> as a continuous stream moving through handlers, whereas in +> relayed-IO the data is operated upon in complete units and +> control is implicitly passed from one processor to the next; +> +> 3) in layered-IO, data processing ends at the outer layer, +> whereas in relayed-IO it ends when the data reaches a special +> state of "no processing left to be done". + +Forgive me for jumping in here. Sometimes those of us who are merely +observers of the core group do not get a perspective on the design +discussions that take place in private emails and in person. Thus, what I +have to say will largely rehash what has been said already. + +It seems to me that a well-rounded IO-layering system has already been +proposed here, in bits and pieces, by different people, over the course of +many threads. The components of the system are: 1. a routine to place the +IO layers in the proper order, 2. a routine to send data between IO +layers, and 3. the layers themselves. + +Selection of IO Layers + +The core selects a source module and IO layers based on the urlspace +configuration. Content might be generated by mod_perl, and the result is +piped through mod_chunk, mod_ssl, and mod_net, in turn. When the content +generator runs, the core enforces that the module set the content type +before the first call to ap_bput. The content type is set by a function +call. The function (ap_set_content_type(request_rec *, char *)) examines +the content type and adds IO layers as necessary. 
For server parsed +html, the core might insert mod_include immediately after mod_perl. + +(Can anyone produce a use case where the IO chain could change after +output begins?) + +Interface Between IO Layers + +The core is responsible for marshalling data between the IO layers. Each +layer registers a callback function (ap_status_t (*)(request_rec *, +buff_vec *)) on which it receives input. Data is sent to the next layer +using ap_bput(request_rec *, buff_vec *). The buff_vec is simply an +ordered array of address and length pairs. Whenever ap_bput is called, +the input callback of the next layer is called. No message queueing, +async handlers, or any of that business is needed. ap_bput keeps track of +where in the output chain things are. Control flow in this system tends +to yo-yo up and down the IO chain. Examples later. + +The only other part of the IO interface is a flush routine. The IO layers +are free to implement whatever they feel flushing involves. + +There are two notable things about this system. First, control flow need +not ever reach the end of the output chain. Any layer is free to return +without calling ap_bput. The layers can do whatever they please with the +data. The network module would be such an example. It would always write +the buffers over the network, and never pass them down the IO chain. If +mod_ssl wanted to handle networking itself, it could do that, too. The +second notable thing is that once a buffer has been sent down the chain, +it is gone forever. Later layers are responsible for freeing the memory +and whatnot. Diddling in a buffer that has already been sent would be bad +form. + +Layer Implementation + +This system has implications for the design and implementation of the +layers. Clearly, it would not be efficient to call ap_bput overly much. 
+Also, the IO layers must be re-entrant in the threaded MPMs, so they will +need some mechanism for storing module-specific state information in the +request context (think mod_include when an include directive spans ap_bput +calls). + +There will be basically three types of layers: those that insert content +into the stream (chunking, SSI), those that replace the stream completely +(encryption, compression), and those that sink the stream (network). The +layers all demonstrate minimal copying: the inserting layers merely move +the boundaries on the incoming buffers and insert a new buffer. The +replacement layers have to create a new buffer and dealloc the old one, +but you can't avoid that in any case. The sinks merely dealloc the +buffers, so no problems there. + +Analysis by Example + +I considered two examples when coming up with this design. One is content +which is dynamically generated by mod_perl, filtered through SSI, chunked, +encrypted, and sent over the wire. The other is fast static content +serving, where a module is blasting out pre-computed HTTP responses a la +SGI's 10x patches. + +In the first situation, imagine that a 10 KB document is generated which +contains two include directives. The include directives insert a standard +banner and the contents of a 40 KB file. The generating module outputs +the data via one ap_set_content_type call and five separate ap_bput calls. +To see the worst case, assume that both include directives span ap_bput +calls. Assume that the included content does not contain any include +directives. + +The IO chain initially looks like this: + +mod_perl->mod_chunk->mod_ssl->mod_net + +After the content type is set, the chain changes: + +mod_perl->mod_include->mod_chunk->mod_ssl->mod_net + +During the inclusion of the 40 KB file, mod_include allocates a series of +4 KB buffers, fills them from the file, and sends them down the chain (or +maybe it uses mmap). 
The analysis is left to the reader, but the end +result is that ap_bput is called 50 times during the request phase. Is +that a lot? Consider the amount of work being done, and the fact that we +have avoided all the overhead of using, for example, actual pipes, or +thread-safe queueing. Calling functions in a single userland context is +known to be fast. The number of calls could be reduced if mod_include +used a larger internal buffer, but at the expense of memory consumption +(or it could use mmap). Note also that the number of ap_bput calls does +not translate into packets on the wire. mod_net is free to do whatever is +optimal with respect to packet boundaries. + +The second example represents high performance static content delivery. +The content-generating module has all headers and content cached or mapped +in memory. The entire output phase is accomplished in a single ap_bput +call, and the networking module does The Right Thing to ensure best +network usage. + +Am I rambling yet? I'd like to get some opinions on this system, if +anybody feels it is significantly different from those already proposed. +I realize that I have waved my hands regarding actually deciding when to +use what IO layers and where, but I am confident that a logically +appealing system could be devised. + +Regards, +Jeffrey + +> +> Yes, these two architectures are similar and can accomplish the +> same tasks, but they don't have the same performance characteristics +> and they don't have the same configuration interface. And, perhaps +> most significantly, relayed-IO systems are not as reliable because +> it is very hard to anticipate how processing will occur and very easy +> for the system to become stuck in an infinite loop. +> +> I don't want a blackboard architecture in Apache, regardless of the +> version of the release or how many users might be satisfied by the +> features it can implement. 
It is unreliable and hard to maintain +> and adds too much latency to the response processing. But if somebody +> else really wants such an architecture, and they understand its implications, +> then I won't prevent them from going with that solution -- I just +> don't want them thinking it is what we meant by layered-IO. + + +From fielding@kiwi.ICS.UCI.EDU Wed Mar 29 14:07:48 2000 +Path: engelschall.com!mail2news!apache.org!new-httpd-return-1599-rse+apache=en.muc.de +From: fielding@kiwi.ICS.UCI.EDU ("Roy T. Fielding") +Newsgroups: en.lists.apache-new-httpd +Subject: Re: layered I/O (was: cvs commit: ...) +Date: 29 Mar 2000 13:45:30 +0200 +Organization: Mail2News at engelschall.com +Lines: 43 +Approved: postmaster@m2ndom +Message-ID: <200003290205.aa19557@gremlin-relay.ics.uci.edu> +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 954330330 23243 141.1.129.1 (29 Mar 2000 11:45:30 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 29 Mar 2000 11:45:30 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:37094 + +>Selection of IO Layers +> +>The core selects a source module and IO layers based on the urlspace +>configuration. Content might be generated by mod_perl, and the result is +>piped through mod_chunk, mod_ssl, and mod_net, in turn. When the content +>generator runs, the core enforces that the module set the content type +>before the first call to ap_bput. The content type is set by a function +>call. The function (ap_set_content_type(request_rec *, char *)) examines +>the content type and adds IO layers as necessary. For server parsed +>html, the core might insert mod_include immediately after mod_perl. 
+ +The problem of thinking of it that way is that, like Dean mentioned, +the output of one module may be filtered and the filter indicate that +content should be embedded from another URL, which turns out to be a +CGI script that outputs further parseable content. In this instance, +the goal of layered-IO is to abstract away such behavior so that the +instance is processed recursively and thus doesn't result in some tangled +mess of processing code for subrequests. Doing it requires that each +layer be able to pass both data and metadata, and have both data and +metadata be processed at each layer (if desired), rather than call a +single function that would set the metadata for the entire response. + +My "solution" to that is to pass three interlaced streams -- data, +metadata, and meta-metadata -- through each layer. The metadata +streams would point to a table of tokenized name-value pairs. +There are lots of ways to do that, going back to my description of +bucket brigades long ago. Basically, each block of memory would +indicate what type of data, with metadata occurring in a block before +the data block(s) that it describes (just like chunk-size describes +the subsequent chunk-data) and the layers could be dynamically +rearranged based on the metadata that passed through them, in +accordance with the purpose of the filter. + +>(Can anyone produce a use case where the IO chain could change after +>output begins?) + +Output is a little easier, but that is the normal case for input. +We don't know what filters to apply to the request body until after +we have passed through the HTTP headers, and the HTTP message processor +is itself a filter in this model. 
+ +....Roy + +From dgaudet@arctic.org Thu Nov 18 17:25:06 1999 +Path: engelschall.com!mail2news!apache.org!new-httpd-owner-rse+apache=en.muc.de +From: dgaudet@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: bucket brigades and IOL +Date: 18 Nov 1999 06:47:20 +0100 +Organization: Mail2News at engelschall.com +Lines: 23 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 942904040 82793 141.1.129.1 (18 Nov 1999 05:47:20 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 18 Nov 1999 05:47:20 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:34101 + + + +On Sat, 13 Nov 1999, Ben Laurie wrote: + +> Also, the usual objections still apply - i.e. it is awkward to do things +> like searching for particular strings, since they may cross boundaries. +> I'm beginning to think that the right answer to this is to provide nice +> matching functions that know about the chunked structures, and last +> resort functions that'll glue it all back into one chunk... + +yeah, we use a zero-copy library at criticalpath and we frequently run +into the case where we want to do some string-like operation on data in +the zero-copy datastructure. you end up having to either copy it to a +regular C string, or re-write all of the string functions. consider +flex... or a regex library... neither work terribly well in the face of a +zero-copy abstraction because they don't have the equivalent of +writev()/readv(). + +but. if apache had it, maybe we would see more libraries start to adopt +iovec-like interfaces... dunno. 
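[ed: the kind of "matching function that knows about the chunked structure" that Ben Laurie asks for could look roughly like this. It is a naive sketch using POSIX struct iovec as the chunk list; the names iov_find and match_at are hypothetical, and a production version would want a better search algorithm than trying every start position. The point is only that a match can be found across chunk boundaries without gluing the chunks back into one C string.]

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Does `needle` occur in the logical byte stream starting at chunk `c`,
 * offset `o`?  Walks across chunk boundaries and skips empty chunks. */
static int match_at(const struct iovec *iov, int iovcnt, int c, size_t o,
                    const char *needle, size_t nlen)
{
    for (size_t k = 0; k < nlen; k++) {
        while (c < iovcnt && o == iov[c].iov_len) {
            c++;                   /* step over the end of a chunk */
            o = 0;
        }
        if (c == iovcnt)
            return 0;              /* ran out of data */
        if (((const char *)iov[c].iov_base)[o] != needle[k])
            return 0;
        o++;
    }
    return 1;
}

/* Returns the logical offset of the first match, or -1 if none. */
long iov_find(const struct iovec *iov, int iovcnt,
              const char *needle, size_t nlen)
{
    long pos = 0;
    assert(iov != NULL && needle != NULL && nlen > 0);
    for (int c = 0; c < iovcnt; c++)
        for (size_t o = 0; o < iov[c].iov_len; o++, pos++)
            if (match_at(iov, iovcnt, c, o, needle, nlen))
                return pos;
    return -1;
}
```

[ed: this covers one string operation only. Dean's broader point stands: a regex engine or a flex-generated scanner would each need the same treatment before they could work on chunked data.]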
+ +Dean + +From dgaudet@arctic.org Sat Nov 20 21:09:50 1999 +Path: engelschall.com!mail2news!apache.org!new-httpd-owner-rse+apache=en.muc.de +From: dgaudet@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: NO_WRITEV +Date: 20 Nov 1999 06:42:11 +0100 +Organization: Mail2News at engelschall.com +Lines: 49 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 943076531 76767 141.1.129.1 (20 Nov 1999 05:42:11 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 20 Nov 1999 05:42:11 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:34157 + +writev() allows us to reduce the number of packets on the network. + +on linux we could use TCP_CORK and get the same effect with less code... +too bad linux is the only unix so far with this nice functionality. +TCP_CORK is like the "other useful" setting other than nagle turned on... +with TCP_CORK, the kernel flushes any packets which fill an entire frame, +but holds partial packets until the socket is close()d or the cork is +removed. in this way you can do multiple smaller write()s (or a write() +and a sendfile()) without causing small packets to go on the wire. + +writev() may consume a small amount more cpu on the server, but it's my +opinion that this is a fine price to pay for fewer packets on the wire. + +if you do choose to benchmark it, be sure to use slow modem clients, and +not fast lan clients... and give strong consideration to using client +latency as your metric rather than server cpu. + +http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html + +Dean + +On Fri, 19 Nov 1999, Eli Marmor wrote: + +> Hello, +> +> Is there any benchmark or statistics about how faster is Apache with +> writev in comparison to without it? (i.e. a normal compilation under +> a platform supporting writev, vs. a compilation with "-DNO_WRITEV"). 
+> +> If there is no official benchmark, can anybody estimate the difference +> (in percents) under a typical use of Apache? Does it depend on the +> type of use (static pages vs. dynamic, SSI and other parsings vs. one +> block, etc.)? Or the other bottlenecks? (it makes sense that when +> your connection to the client is slow, you will not notice the +> difference between writing 2 buffers in two system calls or in one). +> +> If there is a big difference, does it mean that Apache for non-writev +> platforms (such as SCO, BeOS and Tandem) is slower and that these +> platforms are not recommended for Apache users? +> +> On the other hand, if the difference is very small (let's say lower +> than 1%), maybe the benefits don't deserve the price (a much more +> complex code, especially when the non-writev code must be supported +> also in the future because of the non-writev platforms). +> +> -- +> Eli Marmor +> + +From dgaudet@arctic.org Mon Jun 28 19:06:50 1999 +Path: engelschall.com!mail2news!apache.org!new-httpd-owner-rse+apache=en.muc.de +From: dgaudet@arctic.org (Dean Gaudet) +Newsgroups: en.lists.apache-new-httpd +Subject: Re: async routines +Date: 28 Jun 1999 17:33:24 +0200 +Organization: Mail2News at engelschall.com +Lines: 96 +Approved: postmaster@m2ndom +Message-ID: +Reply-To: new-httpd@apache.org +NNTP-Posting-Host: en1.engelschall.com +X-Trace: en1.engelschall.com 930584004 99816 141.1.129.1 (28 Jun 1999 15:33:24 GMT) +X-Complaints-To: postmaster@engelschall.com +NNTP-Posting-Date: 28 Jun 1999 15:33:24 GMT +X-Mail2News-Gateway: mail2news.engelschall.com +Xref: engelschall.com en.lists.apache-new-httpd:31280 + +[hope you don't mind me cc'ing new-httpd zach, I think others will be +interested.] + +On Mon, 28 Jun 1999, Zach Brown wrote: + +> so dean, I was wading through the mpm code to see if I could munge the +> sigwait stuff into it. +> +> as far as I could tell, the http protocol routines are still blocking. 
+> what does the future hold in the way for async routines? :) I basically +> need a way to do something like.. + +You're still waiting for me to get the async stuff in there... I've done +part of the work -- the BUFF layer now supports non-blocking sockets. + +However, the HTTP code will always remain blocking. There's no way I'm +going to try to educate the world in how to write async code... and since +our HTTP code has arbitrary call outs to third party modules... It'd +have a drastic effect on everyone to make this change. + +But I honestly don't think this is a problem. Here's my observations: + +All the popular HTTP clients send their requests in one packet (or two +in the case of a POST and netscape). So the HTTP code would almost +never have to block while processing the request. It may block while +processing a POST -- something which someone else can worry about later, +my code won't be any worse than what we already have in apache. So +any effort we put into making the HTTP parsing code async-safe would +be wasted on the 99.9% case. + +Most responses fit in the socket's send buffer, and again don't require +async support. But we currently do the lingering_close() routine which +could easily use async support. Large responses also could use async +support. + +The goal of HTTP parsing is to figure out which response object to +send. In most cases we can reduce that to a bunch of common response +types: + +- copying a file to the socket +- copying a pipe/socket to the socket (IPC, CGIs) +- copying a mem region to the socket (mmap, some dynamic responses) + +So what we do is we modify the response handlers only. We teach them +about how to send async responses. + +There will be a few new primitives which will tell the core "the response +fits one of these categories, please handle it". The core will do the +rest -- and for MPMs which support async handling, the core will return +to the MPM and let the MPM do the work async... 
the MPM will call a +completion function supplied by the core. (Note that this will simplify +things for lots of folks... for example, it'll let us move range request +handling to a common spot so that more than just default_handler +can support it.) + +I expect this to be a simple message passing protocol (pass by reference). +Well rather, that's how I expect to implement it in ASH -- where I'll +have a single thread per-process doing the select/poll stuff; and the +other threads are in a pool that handles the protocol stuff. For your +stuff you may want to do it another way -- but we'll be using a common +structure that the core knows about... and that structure will look like +a message: + + struct msg { + enum { + MSG_SEND_FILE, + MSG_SEND_PIPE, + MSG_SEND_MEM, + MSG_LINGERING_CLOSE, + MSG_WAIT_FOR_READ, /* for handling keep-alives */ + ... + } type; + BUFF *client; + void (*completion)(struct msg *, int status); + union { + ... extra data here for whichever types need it ...; + } x; + }; + +The nice thing about this is that these operations are protocol +independent... at this level there's no knowledge of HTTP, so the same +MPM core could be used to implement other protocols. + +> so as I was thinking about this stuff, I realized it might be neat to have +> 'classes' of non blocking pending work and have different threads with +> different priorities hacking on it. Say we have a very high priority +> thread that accepts connections, does initial header parsing, and +> sendfile()ing data out. We could have lower priority threads that are +> spinning doing 'harder' BUFF work like an encryption layer or gzipping +> content, whatever. + +You should be able to implement this in your MPM easily I think... because +you'll see the different message types and can distribute them as needed.
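A minimal sketch of the distribution step described above, assuming two hypothetical priority classes with one queue each. The `next` link, `msg_class`, `msg_queue`, `mpm_submit` and `mpm_drain` are illustration-only names and not part of any Apache API; the per-type union is left out, and the mapping of message types to classes is an assumption as well. It is single-threaded on purpose: a real MPM would have a thread pool per queue.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-in for Apache's BUFF; only pointers to it are used here. */
typedef struct buff_stub BUFF;

enum msg_type {
    MSG_SEND_FILE,
    MSG_SEND_PIPE,
    MSG_SEND_MEM,
    MSG_LINGERING_CLOSE,
    MSG_WAIT_FOR_READ          /* for handling keep-alives */
};

struct msg {
    enum msg_type type;
    BUFF *client;
    void (*completion)(struct msg *, int status);
    struct msg *next;          /* assumption: intrusive queue link */
};

/* Hypothetical priority classes, one queue per class. */
enum msg_class { CLASS_FAST, CLASS_SLOW, CLASS_COUNT };

struct msg_queue {
    struct msg *head, *tail;
    size_t len;
};

/* Assumed policy: file and memory copies go to the fast class,
 * everything else (pipes, lingering close, keep-alive waits) is slow. */
static enum msg_class classify(const struct msg *m)
{
    switch (m->type) {
    case MSG_SEND_FILE:
    case MSG_SEND_MEM:
        return CLASS_FAST;
    default:
        return CLASS_SLOW;
    }
}

/* Queue a message on the class chosen by its type. */
void mpm_submit(struct msg_queue queues[CLASS_COUNT], struct msg *m)
{
    struct msg_queue *q;

    assert(m != NULL);
    q = &queues[classify(m)];
    m->next = NULL;
    if (q->tail != NULL)
        q->tail->next = m;
    else
        q->head = m;
    q->tail = m;
    q->len++;
}

/* Pretend every queued operation succeeded: pop each message and call
 * its completion function with status 0. Returns the number completed. */
size_t mpm_drain(struct msg_queue *q)
{
    size_t done = 0;

    while (q->head != NULL) {
        struct msg *m = q->head;
        q->head = m->next;
        if (m->completion != NULL)
            m->completion(m, 0);
        done++;
    }
    q->tail = NULL;
    q->len = 0;
    return done;
}
```

The point of the sketch is that the MPM never looks past `type`: it can route work to differently prioritised workers without knowing anything about HTTP.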
+ +Dean + Index: ossp-pkg/sio/BRAINSTORM/doc_SFmtg.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_SFmtg.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_SFmtg.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/doc_SFmtg.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/doc_SFmtg.txt +++ - 2024-05-20 02:18:02.370908234 +0200 @@ -0,0 +1,172 @@ + +From akosut@leland.Stanford.EDU Thu Jul 23 09:38:40 1998 +Date: Sun, 19 Jul 1998 00:12:37 -0700 (PDT) +From: Alexei Kosut +To: new-httpd@apache.org +Subject: Apache 2.0 - an overview + +For those not at the Apache meeting in SF, and even for those who were, +here's a quick overview of (my understanding of) the Apache 2.0 +architecture that we came up with. I present this to make sure that I have +it right, and to get opinions from the rest of the group. Enjoy. + + +1. "Well, if we haven't released 2.0 by Christmas of 1999, it won't + matter anyway." + +A couple of notes about this plan: I'm looking at this right now from a +design standpoint, not an implementation one. If the plan herein were +actually coded as-is, you'd get a very inefficient web server. But as +Donald Knuth (Professor emeritus at Stanford, btw... :) points out, +"premature optimization is the root of all evil." Rest assured there are +plenty of ways to make sure Apache 2.0 is much faster than Apache 1.3. +Taking out all the "slowness" code, for example... :) + +Also, the main ideas in this document mainly come from Dean Gaudet, Simon +Spero, Cliff Skolnick and a bunch of other people, from the Apache Group's +meeting in San Francisco, July 2 and 3, 1998. The other ideas come from +other people. I'm being vague because I can't quite remember. We should +have videotaped it. I've titled the sections of this document with quotes +from our meeting, but they are paraphrased from memory, so don't take them +too seriously. + +2. "But Simon, how can you have a *middle* end?" 
+ +One of the main goals of Apache 2.0 is protocol independence (i.e., +serving HTTP/1.1, HTTP-NG, and maybe FTP or gopher or something). Another +is to rid the server of the belief that everything is a file. Towards this +end, we divide the server up into three parts, the front end, the middle +end, and the back end. + +The front end is essentially a combination of http_main and http_protocol +today. It takes care of all network and protocol matters, interpreting the +request, putting it into a protocol-neutral form, and (possibly) passing +it off to the rest of the server. This is approximately equivalent to the +part of Apache contained in Dean's flow stuff, and it also works very well +in certain non-Unix-like architectures such as clustered mainframes. In +addition, part of this front-end might be optionally run in kernel space, +giving a very fast server indeed... + +The back end is what generates the content. At the back of the back end we +have backing stores (Cliff's term), which contain actual data. These might +represent files on a disk, entries in a database, CGI scripts, etc... The +back end also consists of other modules, which can alter the request in +various fashions. The objects the server acts on can be thought of (Cliff +again) as a filehandle and a set of key/value pairs (metainformation). +The modules are set up as filters that can alter either one of those, +stacking I/O routines onto the stream of data, or altering the +metainformation. + +The middle end is what comes between the front and back ends. Think of +http_request. This section takes care of arranging the modules, backing +stores, etc... into a manner so that the path of the request will result +in the correct entity being delivered to the front end and sent to the +client. + +3. "I won't embarrass you guys with the numbers for how well Apache + performs compared to IIS." 
(on NT) + +For a server that was designed to handle flat files, Apache does it +surprisingly poorly, compared with other servers that have been optimized +for it. And the performance for non-static files is, of course, worse. +While Apache is still more than fast enough for 95% of Web servers, we'd +be remiss to dismiss those other 5% (they're the fun ones anyway). Another +problem Apache has is its lack of a good, caching, proxy module. + +Put these together, along with the work Dean has done with the flow and +mod_mmap_static stuff, and we realize the most important part of Apache +2.0: a built-in, all-pervasive, cache. Every part of the request process +will involve caching. In the path outlined above, between each layer of +the request, between each module, sits the cache, which can (when it is +useful), cache the response and its metainformation - including its +variance, so it knows when it is safe to give out the cached copy. This +gives every opportunity to increase the speed of the server by making sure +it never has to dynamically create content more than it needs to, and +renders accelerators such as Squid unnecessary. + +This also allows what I alluded to earlier: a kernel (or near-to-kernel) +based web server component, which could read the request, consult the +cache to find the requested object, and spit it back out, without so much +as an interrupt in the way. Of course, the rest of Apache (with all its +modules - it's generally a bad idea to let unknown, untrusted code, insert +itself into the kernel) sits up in user-space, ready to handle any request +the micro-Apache can't. + +A built-in cache also makes a real working HTTP/1.1 proxy server trivially +easy to write. + +4. "Stop asking about backwards compatibility with the API. We'll write a + compatibility module... later." + +If modules are as described above, then obviously they are very much +distinct from how Apache's current modules function. 
The only module +function that is similar to the current model is the handler, or backing +store, that actually provides the basic stream of data that the server +alters to produce a response entity. + +The basic module's approach to its job is to stack a filter onto the +output. But it's better to think of the modules not as a stack that the +request flows through (a layer cake with cache icing between the layers), +but more of a mosaic (pretend I didn't use that word. I wrote collage. You +can't prove anything), with modules stuck onto various sides of the +request at different points, altering the request/response. + +Today's Apache modules take an all-or-nothing approach to request +handlers. They tell Apache what they can do, overestimating, and then are +supposed to DECLINE if they don't pass a number of checks they are +supposed to make. Most modules don't do this correctly. The better +approach is to allow the modules to inform Apache exactly of what they can +do, and have Apache (the middle-end) take care of invoking them when +appropriate. + +The final goal of all of this, of course, is simply to allow CGI output to +be parsed for server-side includes. But don't tell Dean that. + +5. "Will Apache run without any of the normal Unix binaries installed, + only the BSD/POSIX libraries?" + +Another major issue is, of course, configuration of the server. There are +a number of distinct opinions on this, both as to what should be +configured and how it should be done. We talked mainly about the latter, +but did touch on the former. Obviously, with a radically distinct +module API, the configuration is radically different. We need a good way +to specify how the modules are supposed to interact, and to control +what they can do, when and how, balancing what the user asks the server to +do, and what the module (author) wants the server to do. We didn't really +come up with a good answer to this.
+ +However, we did make some progress on the other side of the issue: We +agreed that the current configuration system is definitely taking the +right approach. Having a well-defined repository of the configuration +scheme, containing the possible directives, when they are applicable, what +their parameters are, etc... is the right way to go. We agreed that more +information and stronger-typing (no RAW_ARGS!) would be good, and may +enable on-the-fly generated configuration managers. + +We agreed that such a program, probably external to Apache, would generate +a configuration and pass it to Apache, either via a standard config file, +or by calling Apache API functions. It is desirable to be able to go the +other way, pulling current configuration from Apache to look at, and +perhaps change it on the fly, but unfortunately it is unlikely that this +information would always be available; modules may perform optimizations +on their configuration that make the original configuration unavailable. + +For the language and specification of the configuration, we thought +perhaps XML might be a good approach, and agreed it should be looked +into. Other issues, such as SNMP, were brought up and laughed at. + +6. "So you're saying that the OS that controls half the banks, and 90% of + the airlines, doesn't even have memory protection for separate + processes?" + +Obviously, there are a lot more items that have to be part of Apache 2.0, +and we talked about a number of them. However, the four points above, I +think, represent the core of the architecture we agreed on as a starting +point.
+ +-- Alexei Kosut + Stanford University, Class of 2001 * Apache * + + + + Index: ossp-pkg/sio/BRAINSTORM/doc_bucket_brigades.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_bucket_brigades.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_bucket_brigades.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/doc_bucket_brigades.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/doc_bucket_brigades.txt +++ - 2024-05-20 02:18:02.373512887 +0200 @@ -0,0 +1,381 @@ +To: new-httpd@apache.org +Subject: bucket brigades and IOL +Date: Fri, 12 Nov 1999 23:57:43 -0800 +From: "Roy T. Fielding" +Message-ID: <199911122357.aa18914@gremlin-relay.ics.uci.edu> + +About two years ago I wasted a lot of time writing an Ada95 library +called Onions that provides a stackable stream abstraction for files, +sockets, etc. It is at +if you want to take a look at it, but I don't recommend looking at the +code since it is almost all just working around Ada95's lack of a +system interface. I'll describe the worthwhile bits here. + +The heart of Onions is the input and output stream object +classes and classwide types for building a data stream via a +stack of stream objects (Input_Pipe and Output_Pipe). Reading +from the head of an input pipe causes the head stream object +to read from the next outbound stream object, and on down the line. +Likewise for writing to the head of an output pipe. One of the +main features of streams is that they can filter the data as it +passes, converting, adding to, and/or removing from the data +before giving it to the next object. Since multiple streams can be +cascaded, the complete data conversion is the sum of the individual +data conversions performed by the stream objects. + +So far, no big deal -- this can be manually created by stacking ap_iol +types in a meaningful way. But, the one unique thing I did in Onions was +abstract the memory handling into something called Buckets and moved them +around in Bucket_Brigades. 
A bucket is an allocated segment of memory +with pointers to its allocation address and current size. If I were doing +this in C, I'd also add a pointer to current start address and allocated +size, so that a single bucket could be shrunk from both ends without +copying, and a function pointer for freeing it at the stream end. +Note that this is the same type of memory structure that IO-Lite uses, +though developed independently and for different reasons. + +A bucket brigade is a list-queue of buckets. Each of the stream read/write +calls would pass a bucket brigade instead of single bucket, since this +made insertion by filters more efficient, with the general idea being that +the outbound end of the stream would be writing them out using writev +or reading them in using readv, which is about as efficient as I could +get with Ada95. [I call it a list-queue instead of just queue because you +have the choice of removing buckets from (or adding to) the queue one +bucket at a time or an entire linked list of buckets.] + +But we could go one step further. A bucket is an ADT, and as such can +be used as a general handle for read-only memory, read-write memory, +cache object, file handle, mmap handle, file name, URL, whatever. +What if, instead of just a stream of memory, it could pass around a +stream of memory interspersed with file handles or references to +remote objects? A filter could then add stuff around the stream without +causing too much parsing overhead, and if it needed to look at all the +bytes in the stream it would just replace the bucket handle with a stream +of memory sucked from that handle. Something like this was talked about +last year (see threads on "Stacking up Response Handling" on 23 Sep 1998 +and "I/O filters & reference counts" in late December 1998 and January 1999). +And Dean started something with ap_buf.h, but I don't know how he meant +to finish it.
+ +What I was thinking of was + + typedef enum { + AP_BUCKET_rwmem, + AP_BUCKET_rmem, + AP_BUCKET_file_t, + AP_BUCKET_mmap_t, + AP_BUCKET_filename, + AP_BUCKET_cached_entity, + AP_BUCKET_URI, + } ap_bucket_color_t; + + typedef struct ap_bucket_t ap_bucket_t; + struct ap_bucket_t { + ap_bucket_color_t color; + void *content; + ap_status_t (*free)(ap_bucket_t *bucket); + unsigned int refcount; + }; + + typedef struct ap_bucket_rwmem_t ap_bucket_rwmem_t; + struct ap_bucket_rwmem_t { + void *alloc_addr; + size_t alloc_len; + void *addr; + size_t len; + }; + + typedef struct ap_bucket_rmem_t ap_bucket_rmem_t; + struct ap_bucket_rmem_t { + void *addr; + size_t len; + }; + + typedef struct ap_bucket_filename ap_bucket_filename; + struct ap_bucket_filename { + ap_context_t *ctx; + char *name; + ap_stat_t *stat; /* useful if already stat'ed */ + ap_aa_t *conf; /* access control structure for this file */ + }; + + ... + +and then + + typedef struct ap_bucket_list_t ap_bucket_list_t; + struct ap_bucket_list_t { + ap_bucket_t *bucket; + ap_bucket_list_t *prev; + ap_bucket_list_t *next; + }; + + typedef struct ap_brigade_t ap_brigade_t; + struct ap_brigade_t { + ap_context_t *ctx; + ap_bucket_list_t *first; + ap_bucket_list_t *last; + unsigned int count; + }; + +and then construct the input and output streams as pushing these +bucket brigades to or from the client. The streams would have to +be a little more complicated than Onions, since I learned later that +you also need a parallel stream of header fields (in tokenized form) +in order for it to work with anything HTTP-like. + +Why use an enum instead of a bunch of file pointers for each type +of bucket, kind of like ap_iol? Because it allows adjacent memory +buckets (the most frequent kind after a filter operation) to be +gathered into a single writev. Also, we need a way to be able to +set up an operation and figure out what it will produce without +actually performing the operation -- this is for OPTIONS and HEAD. 
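As a rough illustration of the enum argument above, the stream end can inspect each bucket's color and collect a run of adjacent memory buckets into one iovec array for a single writev(). This is only a sketch: the types are cut-down copies of the ones proposed in the message (no `free` pointer or refcount, only three colors), and `gather_mem_run` is a hypothetical helper name, not an Apache function.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec, writev() (POSIX) */

/* Cut-down versions of the proposed types, for illustration only. */
typedef enum {
    AP_BUCKET_rwmem,
    AP_BUCKET_rmem,
    AP_BUCKET_file_t
} ap_bucket_color_t;

typedef struct {
    void *alloc_addr;
    size_t alloc_len;
    void *addr;
    size_t len;
} ap_bucket_rwmem_t;

typedef struct {
    void *addr;
    size_t len;
} ap_bucket_rmem_t;

typedef struct ap_bucket_t {
    ap_bucket_color_t color;
    void *content;
} ap_bucket_t;

typedef struct ap_bucket_list_t ap_bucket_list_t;
struct ap_bucket_list_t {
    ap_bucket_t *bucket;
    ap_bucket_list_t *prev;
    ap_bucket_list_t *next;
};

/* Fill iov[] from the run of memory buckets starting at *pos.
 * Stops at the first non-memory bucket, at the end of the list, or
 * when max_iov entries are used. *pos is advanced past the buckets
 * consumed; the return value is the number of iovecs filled. */
int gather_mem_run(ap_bucket_list_t **pos, struct iovec *iov, int max_iov)
{
    int n = 0;

    assert(pos != NULL && iov != NULL && max_iov > 0);
    while (*pos != NULL && n < max_iov) {
        ap_bucket_t *b = (*pos)->bucket;

        if (b->color == AP_BUCKET_rmem) {
            ap_bucket_rmem_t *m = b->content;
            iov[n].iov_base = m->addr;
            iov[n].iov_len = m->len;
        } else if (b->color == AP_BUCKET_rwmem) {
            ap_bucket_rwmem_t *m = b->content;
            iov[n].iov_base = m->addr;   /* current start, not alloc_addr */
            iov[n].iov_len = m->len;
        } else {
            break;                       /* file, mmap, ...: handled elsewhere */
        }
        n++;
        *pos = (*pos)->next;
    }
    return n;
}
```

A caller would pass the filled array to `writev(fd, iov, n)` and then deal with whatever bucket `*pos` points at next, for example a file bucket that could go through sendfile. With per-bucket function pointers alone, each bucket would write itself and the run could not be merged this way.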
+ +Note that this would completely change the way we handle internal +redirects, subrequests, server-side include, mod_proxy, access control, etc. +And then most of the API hooks would need to change. I think that is why +Dean was putting it off until 2.1. The annoying thing is that this is the +most useful rearchitecting of the server -- the MPM, APR, and hook changes +make 2.0 easier/cleaner/faster to port to other platforms, but layering +enables in one fell swoop almost every significant non-config feature +that our users have requested. A cache would just be a hash table or +btree of file buckets, complete with AA info. + +Anyway, that was stuck in the back of my head and had to get out. +I won't be able to work on it until after the dissertation is done, +which every day seems to be further away. Maybe 3.0, with rHTTP/2.0. + +....Roy + +================================================= +To: new-httpd@apache.org +Subject: Re: bucket brigades and IOL +In-reply-to: Your message of "Sat, 13 Nov 1999 20:43:58 GMT." + <382DCD8E.881B8468@algroup.co.uk> +Date: Sun, 14 Nov 1999 22:24:03 -0800 +From: "Roy T. Fielding" +Message-ID: <199911142224.aa22545@gremlin-relay.ics.uci.edu> + +BenL wrote: +>I've got to say that this is the most coherent suggestion along these +>lines that I've seen yet. I rather like it. One thing I'd add is that if +>you are going to have a movable "start of block" pointer, and changeable +>length, it can be nice to allocate extra around the edges under some +>circumstances, so that lower layers can expand the block without having +>to add extra chunks. + +Or, alternatively, allocate equal size blocks and just pass around +a reference pair within the buckets that, when the bucket is freed, +access a more complicated reference-counting pool. I think that is +closer to what IO-Lite does. + +>Also, the usual objections still apply - i.e. it is awkward to do things +>like searching for particular strings, since they may cross boundaries. 
+>I'm beginning to think that the right answer to this is to provide nice +>matching functions that know about the chunked structures, and last +>resort functions that'll glue it all back into one chunk... + +Yep, that's what I ended up doing for Ada95, though in that case there +were no easier alternatives. + +....Roy + +================================================= +To: new-httpd@apache.org +Subject: Re: layered I/O (was: cvs commit: ...) +In-reply-to: Your message of "Wed, 29 Mar 2000 01:21:09 PST." + +Date: Wed, 29 Mar 2000 02:05:08 -0800 +From: "Roy T. Fielding" +Message-ID: <200003290205.aa19557@gremlin-relay.ics.uci.edu> + +>Selection of IO Layers +> +>The core selects a source module and IO layers based on the urlspace +>configuration. Content might be generated by mod_perl, and the result is +>piped through mod_chunk, mod_ssl, and mod_net, in turn. When the content +>generator runs, the core enforces that the module set the content type +>before the first call to ap_bput. The content type is set by a function +>call. The function (ap_set_content_type(request_rec *, char *)) examines +>the content type and adds IO layers as necessary. For server parsed +>html, the core might insert mod_include immediately after mod_perl. + +The problem of thinking of it that way is that, like Dean mentioned, +the output of one module may be filtered and the filter indicate that +content should be embedded from another URL, which turns out to be a +CGI script that outputs further parseable content. In this instance, +the goal of layered-IO is to abstract away such behavior so that the +instance is processed recursively and thus doesn't result in some tangled +mess of processing code for subrequests. Doing it requires that each +layer be able to pass both data and metadata, and have both data and +metadata be processed at each layer (if desired), rather than call a +single function that would set the metadata for the entire response.
+ +My "solution" to that is to pass three interlaced streams -- data, +metadata, and meta-metadata -- through each layer. The metadata +streams would point to a table of tokenized name-value pairs. +There are lots of ways to do that, going back to my description of +bucket brigades long ago. Basically, each block of memory would +indicate what type of data, with metadata occurring in a block before +the data block(s) that it describes (just like chunk-size describes +the subsequent chunk-data) and the layers could be dynamically +rearranged based on the metadata that passed through them, in +accordance with the purpose of the filter. + +>(Can anyone produce a use case where the IO chain could change after +>output begins?) + +Output is a little easier, but that is the normal case for input. +We don't know what filters to apply to the request body until after +we have passed through the HTTP headers, and the HTTP message processor +is itself a filter in this model. + +....Roy + + +================================================= +To: new-httpd@apache.org +Subject: Re: filtering patches +In-reply-to: Your message of "Mon, 10 Jul 2000 15:33:25 PDT." + +Date: Mon, 10 Jul 2000 16:58:00 -0700 +From: "Roy T. Fielding" +Message-ID: <200007101657.aa21782@gremlin-relay.ics.uci.edu> + +[...] +I meant that the filters, when written to as part of the output stream, +are treated as a stack (write to the top-most filter without any knowledge +of what may lie underneath it). So the process of arranging filters +for a particular response is like dropping them onto a stack. When a +filter is done or the stream is closed, each instantiated filter cleans +up according to its local state and then destroys itself (as it is popped +off the stack). + +This is completely separate from the registration of filters by +name and purpose, which could be done by hooks. 
The difference is that +filters are registered at config time but only instantiated (given local +storage) and arranged on a per stream basis. + +Bucket brigades is simply a way to encapsulate and pass data down the stream +such that it can be as efficient as the sender desires, while retaining +a simple interface. The purpose of the bucket is to make handling of the +data uniform regardless of its type, or make type-specific conversions +via a single ADT call if and only if they are needed by some filter. +The purpose of the brigade is to reduce the number of calling arguments +and linearize the calling sequence for insertion filters. Each filter +definition is separate from its instantiation on the stream because +there may be many streams operating at once within a single program. +Each bucket is independent of the brigade so that the filters can rearrange +and insert buckets at will. Each data item is isolated by the bucket +structure, which allows them to be split across child buckets or shared +with multiple streams (e.g., cached objects). We don't need to implement +all of this on the first pass -- we just need to implement the ADT external +interfaces such that they don't have to change as we make the overall +stream more efficient. + +BTW, in case I didn't make this clear in past messages, this design is +an amalgam of the best aspects of the designs from Henrik's Streams +(see w3c-libwww), sfio (AT&T Research), IO-Lite (Rice Univ.), and +libwww-ada95 (UCI). The MIME stuff in MSIE is based on Henrik's streams. +Henrik's stuff is very fast, but is spaghetti code because it relies on +callbacks and legacy stuff in libwww. sfio is great but is intended to +be a complete replacement for stdio and hence does way too much and is +subject to a few patents that I don't appreciate. IO-Lite is cool but +is probably only worth it when the entire OS is based on IO-Lite memory +management, but regardless the code isn't available for commercial use. 
+As Dean has mentioned many times, we already get most of the performance +benefit of IO-Lite simply by avoiding memory copies on large writes. +libwww-ada95 was an attempt to make Ada95 suck less for systems programming, +which was only a partial success (it is very fast compared to other Ada95 +libraries, but memory management became a problem with complex filters). + +Writing our own streams library isn't NIH syndrome -- both Dean and I +have independently investigated the other available alternatives and they +just aren't suitable for our purpose. Even with all that, my own design +only truly pays off (versus plain old BUFF) when you make good use of +sendfile and shared object caching. + +[...] + + +================================================= +Other stuff Roy wrote on new-httpd: + +My buckets are passed around in list-queues (really just lists with front +and back pointers). My buckets carry data and metadata and meta-metadata. +My buckets are used to indicate stream-end, and the filter configuration +itself is determined by the stream content. It probably sounds weird, but +the effects of this interface are completely different than mere content +filters. They simplify everything. I'm not saying that we have to +simplify everything right away, but I am saying that it is just as easy +to implement a fully-general filter using bucket brigades as it is +to implement string interface filters -- all of the complex parts +are handled by the ADTs. + +... + +The real psychedelic stuff happens when you can pass metadata (tokenized +header fields) as buckets and the filters know how to pass that down the +chain without treating them as data. + +... + +The purpose of passing a list of buckets around is to linearize +the call stack for the frequent case of filtered content +splitting one large bucket into separate buckets with filtered results +interspersed in between. 
The effect is that a filter chain can frequently +process an entire message in one pass down the chain, which enables the +stream end to send the entire response in one go, which also allows it +to do interesting things like provide a content length by summing the +data length of all the buckets' data, and set a last-modified time +by picking the most recent time from a set of static file buckets. + +I think it would help if we stopped using artificial examples. Let's +try something simple: + + socket <-- http <-- add_footer <-- add_header <-- send_file + +send_file calls its filter with an ap_file_t bucket and End-of-Stream (EOS) +in the bucket list. add_header sets a flag, prepends another ap_file_t +bucket to the list and sends the list to its filter. add_footer looks +at the list, finds the EOS, inserts another ap_file_t bucket in +front of the EOS, and sends the list on to its filter. http walks through +the list picking up the (cached) stat values, notes the EOS and seeing +that its own flag for headers_sent is false, sets the cumulative metadata +and sends the header fields, followed by three calls to the kernel to +send out the three files using whatever mechanism is most efficient. + +The point here isn't that this is the only way to implement filters. +The point is that no other interface can implement them as efficiently. +Not even close. Yes, there are cases where string filters are just as +efficient as any other design, but there is no case in which they are +more efficient than bucket brigades. The reason is that being able +to process a list of strings in one call more than offsets the extra +cost of list processing, regardless of the filter type, and allows +for additional features that have benefits for http processing. +Like, for example, being able to determine the entire set of resources +that make up the source of this dynamic resource without teaching every +filter about WebDAV. + +... 
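The socket <-- http <-- add_footer <-- add_header <-- send_file walk-through above can be sketched as follows. This is a toy model under stated assumptions: buckets carry only a name and a cached size, the brigade is a plain doubly linked list, nothing is actually sent, and every identifier (`bucket`, `brigade`, `chain_state`, the `*_filter` functions) is made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BK_FILE, BK_EOS } bk_type;   /* simplified bucket types */

typedef struct bucket {
    bk_type type;
    const char *name;          /* file name, for illustration */
    size_t size;               /* cached stat size */
    struct bucket *prev, *next;
} bucket;

typedef struct { bucket *first, *last; } brigade;

typedef struct {
    int header_added;          /* add_header's flag */
    int headers_sent;          /* http's flag */
    size_t content_length;     /* cumulative metadata */
    int files_sent;            /* stands in for the kernel calls */
} chain_state;

/* Insert b before pos; pos == NULL appends at the end. */
static void brigade_insert_before(brigade *bb, bucket *pos, bucket *b)
{
    b->next = pos;
    b->prev = pos ? pos->prev : bb->last;
    if (b->prev) b->prev->next = b; else bb->first = b;
    if (pos) pos->prev = b; else bb->last = b;
}

/* Stream end: one walk to sum sizes and find EOS, then "send". */
static void http_filter(chain_state *st, brigade *bb)
{
    size_t total = 0;
    int saw_eos = 0;
    bucket *b;

    for (b = bb->first; b != NULL; b = b->next) {
        if (b->type == BK_EOS) { saw_eos = 1; break; }
        total += b->size;
    }
    if (saw_eos && !st->headers_sent) {
        st->content_length = total;  /* known before any body byte */
        st->headers_sent = 1;
    }
    for (b = bb->first; b != NULL && b->type != BK_EOS; b = b->next)
        st->files_sent++;            /* a real server might sendfile() here */
}

/* Insert the footer in front of EOS, if EOS is in this brigade. */
static void add_footer_filter(chain_state *st, brigade *bb, bucket *footer)
{
    bucket *b = bb->first;

    while (b != NULL && b->type != BK_EOS)
        b = b->next;
    if (b != NULL)
        brigade_insert_before(bb, b, footer);
    http_filter(st, bb);
}

/* Prepend the header once, then pass the list on. */
static void add_header_filter(chain_state *st, brigade *bb,
                              bucket *header, bucket *footer)
{
    if (!st->header_added) {
        brigade_insert_before(bb, bb->first, header);
        st->header_added = 1;
    }
    add_footer_filter(st, bb, footer);
}

/* Content generator: a file bucket plus EOS, handed down in one call. */
void send_file(chain_state *st, bucket *file, bucket *eos,
               bucket *header, bucket *footer)
{
    brigade bb = { NULL, NULL };

    brigade_insert_before(&bb, NULL, file);
    brigade_insert_before(&bb, NULL, eos);
    add_header_filter(st, &bb, header, footer);
}
```

Each filter is entered exactly once for the whole response, and the stream end sees all three file buckets together, which is what lets it compute a content length before sending anything.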
+ +Making many small calls down the filter chain is something best +avoided, which is why the bucket brigades interface consists of +a linked list of buckets, such that all of the currently available +data can be passed-on in a single call. + +Being able to handle sendfile, cached objects and subrequests is very +effective at improving efficiency, which is why the buckets are typed. +A filter that needs to do byte-level processing will have to call a +routine to convert the typed bucket into a data stream, but that decision +is delayed until no other choice is available and adds no overhead to +the common cases of non-filtered or pre-/post-filtered objects. + +Being able to process header fields (metadata) through the same filter +set as the data is necessary for correctness and simplicity in the +proper ordering of independently developed filter modules, which is +why the buckets can carry metadata on the same stream. Every filter +has to be knowledgeable about metadata because only the filter knows +whether or not its actions will change the nature of the data. + + Index: ossp-pkg/sio/BRAINSTORM/doc_dean_iol.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_dean_iol.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_dean_iol.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/doc_dean_iol.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/doc_dean_iol.txt +++ - 2024-05-20 02:18:02.376250653 +0200 @@ -0,0 +1,496 @@ +goals? we need an i/o abstraction which has these properties: + +- buffered and non-buffered modes + + The buffered mode should look like FILE *. + + The non-buffered mode should look more like read(2)/write(2). + +- blocking and non-blocking modes + + The blocking mode is the "easy" mode -- it's what most module writers + will see. The non-blocking mode is the "hard" mode, this is where + module writers wanting to squeeze out some speed will have to play. 
+ In order to build async/sync hybrid models we need the + non-blocking i/o abstraction. + +- timed reads and writes (for blocking cases) + + This is part of my jihad against asynchronous notification. + +- i/o filtering or layering + + Yet another Holy Grail of computing. But I digress. These are + hard when you take into consideration non-blocking i/o -- you have + to keep lots of state. I expect our core filters will all support + non-blocking i/o, well at least the ones I need to make sure we kick + ass on benchmarks. A filter can deny a switch to non-blocking mode, + the server will have to recover gracefully (ha). + +- copy-avoidance + + Hey what about zero copy a la IO-Lite? After having experienced it + in a production setting I'm no longer convinced of its benefits. + There is an enormous amount of overhead keeping lists of buffers, + and reference counts, and cleanup functions, and such which requires + a lot of tuning to get right. I think there may be something here, + but it's not a cakewalk. + + What I do know is that the heuristics I put into apache-1.3 to choose + writev() at times are almost as good as what you can get from doing + full zero-copy in the cases we *currently* care about. To put it + another way, let's wait another generation to deal with zero copy. + + But sendfile/transmitfile/etc. those are still interesting. + + So instead of listing "zero copy" as a property, I'll list + "copy-avoidance". + +So far? + +- ap_bungetc added +- ap_blookc changed to return the character, rather than take a char *buff +- in theory, errno is always useful on return from a BUFF routine +- ap_bhalfduplex, B_SAFEREAD will be re-implemented using a layer I think +- chunking gone for now, will return as a layer +- ebcdic gone for now... it should be a layer + +- ap_iol.h defined, first crack at the layers... + + Step back a second to think on it. Much like we have fread(3) + and read(2), I've got a BUFF and an ap_iol abstraction. 
An ap_iol + could use a BUFF if it requires some form of buffering, but many + won't require buffering... or can do a better job themselves. + + Consider filters such as: + - ebcdic -> ascii + - encryption + - compression + These all share the property that no matter what, they're going to make + an extra copy of the data. In some cases they can do it in place (read) + or into a fixed buffer... in most cases their buffering requirements + are different than what BUFF offers. + + Consider a filter such as chunking. This could actually use the writev + method to get its job done... depends on the chunks being used. This + is where zero-copy would be really nice, but we can get by with a few + heuristics. + + At any rate -- the NSPR folks didn't see any reason to include a + buffered i/o abstraction on top of their layered i/o abstraction... so + I feel like I'm not the only one who's thinking this way. + +- iol_unix.c implemented... should hold us for a bit + + +============================== +Date: Mon, 10 Apr 2000 14:39:48 -0700 (PDT) +From: dean gaudet +To: new-httpd@apache.org +Subject: Re: Buff should be an I/O layer +In-Reply-To: <20000410123109.C3931@manojk.users.mindspring.com> +Message-ID: + +[hope you don't mind me taking this back to new-httpd so that it's +archived this time :)] + +On Mon, 10 Apr 2000, Manoj Kasichainula wrote: + +> On Mon, Mar 27, 2000 at 04:48:23PM -0800, Dean Gaudet wrote: +> > On Sat, 25 Mar 2000, Manoj Kasichainula wrote: +> > > (aside: Though my unschooled brain still sees no +> > > problem if our chunking layer maintains a pile of 6-byte blocks that +> > > get used in an iol_writev. I'll read the archived discussions.) +> > +> > there's little in the way of archived discussions, there's just me admitting +> > that i couldn't find a solution which was not complex.
+> +> OK, there's got to be something wrong with this: +> +> chunk_iol->iol_write(char *buffer) { +> pull a 10-byte (or whatever) piece out of our local stash +> construct a chunk header in it +> set the iovec = chunk header + buffer +> writev(iovec) +> } +> +> But what is it? + +when i was doing the new apache-2.0 buffering i was focusing a lot on +supporting non-blocking sockets so we could do the async i/o stuff -- and +to support a partial write you need to keep more state than what your +suggestion has. + +also, the real complexity comes when you consider handling a pipelined +HTTP/1.1 connection -- consider what happens when you get 5 requests +for /cgi-bin/printenv smack one after the other. + +if you do that against apache-1.3 and the current apache-2.0 you get +back maximally packed packets. but if you make chunking a layer then +every time you add/remove the layer you'll cause a packet boundary -- +unless you add another buffering layer... or otherwise shift around +the buffering. + +as a reminder, visit + for a +description of how much we win on the wire from such an effort. + +also, at some point i worry that passing the kernel dozens of tiny +iovecs is more expensive than an extra byte copy into a staging buffer, +and passing it one large buffer. but i haven't done any benchmarks to +prove this. (my suspicions have to do with the way that at least the +linux kernel's copying routine is written regarding aligned copies) + +oh it's totally worth pointing out that at least Solaris allows at +most 16 iovecs in a single writev()... which probably means every sysv +derived system is similarly limited. linux sets the limit at 1024. +freebsd has an optimisation for up to 8, but otherwise handles 1024. + +i'm still doing work in this area though -- after all my ranting about +zero-copy a few weeks back i set out to prove myself wrong by writing +a zero-copy buffering library using every trick in my book. i've no +results to share yet though.
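The extra state that a partial write demands can be made concrete with a
small sketch. This is not code from apache-1.3 or 2.0: chunk_header() and
iov_advance() are hypothetical helpers, and the three-element vector
(chunk-size line, body, CRLF trailer) is an assumed layout. The point is
only that after a short writev() on a non-blocking socket the caller has to
remember how far into the vector the kernel got, which the simple
iol_write pseudocode quoted above does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/uio.h>

/* Hypothetical helper: format an HTTP/1.1 chunk-size line ("1a\r\n" for
 * 26 bytes) into hdr and return its length. */
static int chunk_header(char *hdr, size_t hdrsize, size_t body_len)
{
    return snprintf(hdr, hdrsize, "%lx\r\n", (unsigned long)body_len);
}

/* Hypothetical helper: after writev() reported `written` bytes, skip the
 * iovecs that went out completely and trim the one that went out
 * partially.  Returns the number of iovecs still pending; *piov is moved
 * to the first of them. */
static int iov_advance(struct iovec **piov, int iovcnt, size_t written)
{
    struct iovec *iov = *piov;

    while (iovcnt > 0 && written >= iov->iov_len) {
        written -= iov->iov_len;
        iov++;
        iovcnt--;
    }
    if (iovcnt > 0) {
        iov->iov_base = (char *)iov->iov_base + written;
        iov->iov_len -= written;
    } else {
        assert(written == 0);   /* cannot have written more than we had */
    }
    *piov = iov;
    return iovcnt;
}
```

A non-blocking caller would feed the return value of each short writev()
to iov_advance() and retry with the remaining vector once the socket is
writable again; both the pointer and the count have to survive between
those two calls.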
+ +-dean + + +============================== +Date: Tue, 2 May 2000 15:51:30 +0200 +From: Martin Kraemer +To: new-httpd@apache.org +Subject: BUFF, IOL, Chunking, and Unicode in 2.0 (long) +Message-ID: <20000502155129.A10548@pgtm0035.mch.sni.de> + +Sorry for a long silence in the past weeks, I've been busy with other +stuff. + +Putting the catch-words "Chunking, Unicode and 2.0" into the subject +was on purpose: I didn't want to scare off anyone because of the word +EBCDIC: the problems I describe here, and the proposed new buff.c +layering, are mostly independent from the EBCDIC port. + + +In the past weeks, I've been thinking about today's buff.c (and +studied its applicability for automatic conversion stuff like in the +russian apache, see apache.lexa.ru). I think it would be neat to be +able to do automatic character set conversion in the server, for +example by negotiation (when the client sends an Accept-Charset and +the server doesn't have a document with exactly the right Charset, but +knows how to generate it from an existing representation). + +IMO it is a reoccurring problem, + +* not only in today's russian internet environment (de facto browsers + support 5 different cyrillic character sets, but the server doesn't + want to hold every document in 5 copies, so an automatic translation + is performed by the russian apache, depending on information supplied + by the client, or by explicit configuration). One of the supported + character sets is Unicode (UTF-7 or UTF-8) + +* in japanese/chinese environments, support for 16 bit character sets + is an absolute requirement. (Other oriental scripts like Thai get + along with 8 bit: they only have 44 consonants and 16 vowels). + Having success on the eastern markets depends to a great deal on + having support for these character sets. 
The japanese Apache + community hasn't had much contact with new-httpd in the past, but + I'm absolutely sure that there is a "standard japanese patch" for + Apache which would be well worth integrating into the standard + distribution. (Anyone on the list to provide a pointer?) + +* In the future, more and more browsers will support unicode, and so + will the demand grow for servers supporting unicode. Why not + integrate ONE solution for the MANY problems worldwide? + +* The EBCDIC port of 1997 has been a simple solution for a rather + simple problem. If we would "do it right" for 2.0 and provide a + generic translation layer, we would solve many problems in a single + blow. The EBCDIC translation would be only one of them. + +Jeff has been digging through the EBCDIC stuff and apparently +succeeded in porting a lot of the 1.3 stuff to 2.0 already. Jeff, I'd +sure be interested in having a look at it. However, when I looked at +buff.c and the new iol_* functionality, I found out that iol's are not +the way to go: they give us no solution for any of the conversion +problems: + +* iol's sit below BUFF. Therefore, they don't have enough information + to know which part of the written byte stream is net client data, + and which part is protocol information (chunks, MIME headers for + multipart/*). + +* iol's don't allow simplification of today's chunking code. It is + spread throughout buff.c and there's a very hairy balance between + efficiency and code correctness. Re-adding (EBCDIC/UTF) conversion, + possibly with support for multi byte character sets (MBCS), would + make a code nightmare out of it. (buff.c in 1.3 was "almost" a + nightmare because we had only single byte translations.) + +* Putting conversion to a hierarchy level any higher than buff.c is no + solution either: for chunks, as well as for multipart headers and + buffering boundaries, we need character set translation.
Pulling it + to a higher level means that a lot of redundant information has to + be passed down and up. + +In my understanding, we need a layered buff.c (which I number from 0 +upwards): + +0) at the lowest layer, there's a "block mode" which basically + supports bread/bwrite/bwritev by calling the equivalent iol_* + routines. It doesn't know about chunking, conversion, buffering and + the like. All it does is read/write with error handling. + +1) the next layer handles chunking. It knows about the current + chunking state and adds chunking information into the written + byte stream at appropriate places. It does not need to know about + buffering, or what the current (ebcdic?) conversion setting is. + +2) this layer handles conversion. I was thinking about a concept + where a generic character set conversion would be possible based on + Unicode-to-any translation tables. This would also deal with + multibyte character sets, because at this layer, it would + be easy to convert SBCS to MBCS. + Note that conversion *MUST* be positioned above the chunking layer + and below the buffering layer. The former guarantees that chunking + information is not converted twice (or not at all), and the latter + guarantees that ap_bgets() is looking at the converted data + (-- otherwise it would fail to find the '\n' which indicates end- + of-line). + Using (loadable?) translation tables based on unicode definitions + is a very similar approach to what libiconv offers you (see + http://clisp.cons.org/~haible/packages-libiconv.html -- though my + inspiration came from the russian apache, and I only heard about + libiconv recently). Every character set can be defined as a list + of pairs, and translations between + several SBCS's can be collapsed into a single 256 char table. + Efficiently building them once only, and finding them fast is an + optimization task. + +3) This last layer adds buffering to the byte stream of the lower + layers. 
Because chunking and translation have already been dealt + with, it only needs to implement efficient buffering. Code + complexity is reduced to simple stdio-like buffering. + + +Creating a BUFF stream involves creation of the basic (layer 0) BUFF, +and then pushing zero or more filters (in the right order) on top of +it. This will always add the chunking layer, optionally add +the conversion layer, and usually add the buffering layer (look for +ap_bcreate() in the code: it almost always uses B_RD/B_WR). + +Here's code from a conceptual prototype I wrote: + BUFF *buf = ap_bcreate(NULL, B_RDWR), *chunked, *buffered; + chunked = ap_bpush_filter(buf, chunked_filter, 0); + buffered = ap_bpush_filter(chunked, buffered_filter, B_RDWR); + ap_bputs("Data for buffered ap_bputs\n", buffered); + + +Using a BUFF stream doesn't change: simply invoke the well known API +and call ap_bputs() or ap_bwrite() as you would today. Only, these +would be wrapper macros + + #define ap_bputs(data, buf) (buf)->bf_puts(data, buf) + #define ap_bwrite(buf, data, max, lenp) (buf)->bf_write(buf, data, max, lenp) + +where a BUFF struct would hold function pointers and flags for the +various levels' input/output functions, in addition to today's BUFF +layout. + +For performance improvement, the following can be added to taste: + +* less buffering (zero copy where possible) by putting the buffers + for buffered reading/writing down as far as possible (for SBCS: from + layer 3 to layer 0). By doing this, the buffer can also hold a + chunking prefix (used by layer 1) in front of the buffering buffer + to reduce the number of vectors in a writev, or the number of copies + between buffers. Each layer could indicate whether it needs a + private buffer or not. + +* intra-module calls can be hardcoded to call the appropriate lower + layer directly, instead of using the ap_bwrite() etc macros. That + means we don't use the function pointers all the time, but instead + call the lower levels directly.
OTOH we have iol_* stuff which uses + function pointers anyway. We decided in 1.3 that we wanted to avoid + the C++ type stuff (esp. function pointers) for performance reasons. + But it would sure reduce the code complexity a lot. + +The resulting layering would look like this: + + | Caller: using ap_bputs() | or ap_bgets/ap_bwrite etc. + +--------------------------+ + | Layer 3: Buffered I/O | gets/puts/getchar functionality + +--------------------------+ + | Layer 2: Code Conversion | (optional conversions) + +--------------------------+ + | Layer 1: Chunking Layer | Adding chunks on writes + +--------------------------+ + | Layer 0: Binary Output | bwrite/bwritev, error handling + +--------------------------+ + | iol_* functionality | basic i/o + +--------------------------+ + | apr_* functionality | + .... + +-- + | Fujitsu Siemens +Fon: +49-89-636-46021, FAX: +49-89-636-41143 | 81730 Munich, Germany + + +============================== +Date: Tue, 2 May 2000 09:09:28 -0700 (PDT) +From: dean gaudet +To: new-httpd@apache.org +Subject: Re: BUFF, IOL, Chunking, and Unicode in 2.0 (long) +In-Reply-To: <20000502155129.A10548@pgtm0035.mch.sni.de> +Message-ID: + +On Tue, 2 May 2000, Martin Kraemer wrote: + +> * iol's sit below BUFF. Therefore, they don't have enough information +> to know which part of the written byte stream is net client data, +> and which part is protocol information (chunks, MIME headers for +> multipart/*). + +there's not much stopping you from writing an iol which takes a BUFF * in +its initialiser, and then bcreating a second BUFF, and bpushing your iol. +like: + + /* this is in r->pool rather than r->connection->pool because + * we expect to create & destroy this inside request boundaries + * and if we stuck it in r->connection->pool the storage wouldn't + * be reclaimed early enough on pipelined connections.
+ * + * also, no need for buffering in new_buff because the translation + * layer can easily assume lower level BUFF is doing the buffering. + */ + new_buff = ap_bcreate(r->pool, B_WR); + ap_bpush_iol(new_buff, + ap_utf8_to_ebcdic(r->pool, r->connection->client)); + r->connection->client = new_buff; + +main problem is that the new_buff only works for writing, and you +potentially need a separate conversion layer for reading from the +client. + +shouldn't be too hard to split up r->connection->client into a read and +write half. + +think of iol as the equivalent of the low level read/write, and BUFF +as the equivalent of FILE *. there's a reason for both layers in +the interface. + +> * iol's don't allow simplification of today's chunking code. It is +> spread throughout buff.c and there's a very hairy balance between +> efficiency and code correctness. Re-adding (EBCDIC/UTF) conversion, +> possibly with support for multi byte character sets (MBCS), would +> make a code nightmare out of it. (buff.c in 1.3 was "almost" a +> nightmare because we had only single byte translations.) + +as i've said before, i welcome anyone to do it otherwise without adding +network packets, without adding unnecessary byte copies, and without +making it even more complex. until you've tried it, it's pretty easy +to just say "this is a mess". once you've tried it i suspect you'll +discover why it is a mess. + +that said, i'm still trying to prove to myself that the zero-copy +crud necessary to clean this up can be done in a less complex manner. + +> * Putting conversion to a hierarchy level any higher than buff.c is no +> solution either: for chunks, as well as for multipart headers and +> buffering boundaries, we need character set translation. Pulling it +> to a higher level means that a lot of redundant information has to +> be passed down and up. + +huh?
HTTP is in ASCII -- you don't need any conversion -- if a chunking +BUFF below a converting BUFF/iol is writing those things in ascii +it works. no? at least that's my understanding of the code in 1.3. + +you wouldn't do the extra BUFF layer above until after you've written +the headers into the plain-text BUFF. + +i would expect you'd: + + write headers through plain text BUFF + push conversion BUFF + run method + pop conversion BUFF + pump multipart header + push conversion BUFF + ... + pop conversion BUFF + +> In my understanding, we need a layered buff.c (which I number from 0 +> upwards): + +you've already got it :) + +> | Caller: using ap_bputs() | or ap_bgets/apbwrite etc. +> +--------------------------+ +> | Layer 3: Buffered I/O | gets/puts/getchar functionality +> +--------------------------+ +> | Layer 2: Code Conversion | (optional conversions) +> +--------------------------+ +> | Layer 1: Chunking Layer | Adding chunks on writes +> +--------------------------+ +> | Layer 0: Binary Output | bwrite/bwritev, error handling +> +--------------------------+ +> | iol_* functionality | basic i/o +> +--------------------------+ +> | apr_* functionality | + +there are two cases you need to consider: + +chunking and a partial write occurs -- you need to keep track of how much +of the chunk header/trailer was written so that on the next loop around +(which happens in the application at the top) you continue where you +left off. + +and more importantly at the moment, and easier to grasp -- consider what +happens when you've got a pipelined connection. a dozen requests come +in from the client, and apache-1.3 will send back the minimal number +of packets. 2.0-current still needs fixing in this area (specifically +saferead needs to be implemented). + +for example, suppose the client sends one packet: + + GET /images/a.gif HTTP/1.1 + Host: foo + + GET /images/b.gif HTTP/1.1 + Host: foo + +suppose that a.gif and b.gif are small 200 byte files. 
+ +apache-1.3 sends back one response packet: + + HTTP/1.1 OK + headers + + a.gif body + HTTP/1.1 OK + headers + + b.gif body + +consider what happens with your proposal. in between each of those +requests you remove the buffering -- which means you have to flush a +packet boundary. so your proposal generates two network packets. + +like i've said before on this topic -- if all unixes had TCP_CORK, +it'd be a breeze. but only linux has TCP_CORK. + +you pretty much require a layer of buffering right above the iol which +talks to the network. + +and once you put that layer of buffering there, you might as well merge +chunking into it, because chunking needs buffering as well (specifically +for the async i/o case). + +and then you either have to double-buffer, or you can only stack +non-buffered layers above it. fortunately, character-set conversion +should be doable without any buffering. + +*or* you implement a zero-copy library, and hope it all works out in +the end. + +-dean + Index: ossp-pkg/sio/BRAINSTORM/doc_greg_filters.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_greg_filters.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_greg_filters.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/doc_greg_filters.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/doc_greg_filters.txt +++ - 2024-05-20 02:18:02.379010553 +0200 @@ -0,0 +1,102 @@ +Date: Fri, 14 Apr 2000 13:46:50 -0700 (PDT) +From: Greg Stein +To: new-httpd@apache.org +Subject: Re: I/O filtering in 2.0 +In-Reply-To: +Message-ID: + +On Fri, 14 Apr 2000 rbb@covalent.net wrote: +> I am not calling this I/O Layering, because this is really output +> filtering. The patch I am submitting allows modules to edit data after a +> handler has finished with it. This is basically Greg's approach. + +I'll detail my approach here, as your patch has some pieces, but it is +quite different. + +All of this is obviously IMO... + + +*) we definitely want multiple output filters. 
each filter is recorded in + a linked list in the request_rec. + +*) a filter has a name and is implemented by a module. this mapping is set + up similarly to handler maps in the 'module' structure. + +*) output from normal modules is identical to today. they use ap_rputs, + ap_rwrite, etc. Filtering occurs under the covers. + +*) Apache defines ap_lwrite(ap_layer *next_layer, + const void *buf, size_t len, + request_rec *r) + and possibly some similar ones for printf, puts, etc + +*) struct ap_layer_s { + const char *layer_name; + layer_func_t func; + struct ap_layer_s *next; + }; + + /* filters implement function with this type: */ + typedef ap_status_t (*layer_func_t)(ap_layer *next_layer, + const void *buf, size_t len, + request_rec *r); + /* ### dunno about that return type */ + /* looks remarkably similar to ap_lwrite(), eh? */ + +*) ap_status_t ap_lwrite(ap_layer *layer, const void *buf, + size_t len, request_rec *r) + { + ap_ssize_t amt; /* bytes written; type assumed */ + + if (layer == NULL) { + ap_bwrite(r->connection->client, buf, len, &amt); + return OK; + } + return (*layer->func)(layer->next, buf, len, r); + } + +*) a new Apache directive can detail the sequence of filters and install + them into the request_rec. + +*) ap_rwrite() and friends calls ap_lwrite(r->first_layer, ...). this will + perform actual output filtering, or go off to the BUFF stuff. + +*) a new hook is added: install_filters. it is called right before + invoke_handlers and is responsible for setting r->first_layer and/or + elements along the list. + +*) a new, small module can implement a directive which responds to + install_filters and sets up a sequence of filters based on their names. + for example: + SetFilters PHP SSI + +*) content handlers (e.g. during invoke_handler processing) have a new + function to call: ap_set_content_type(r, const char *type). when the + type is changed, such as during CGI processing, this function is called + and an opportunity (somehow?
haven't thought on this part) is provided + for new output layers to be inserted. + [ this provides for a CGI output'ing application/x-httpd-php3 ] + + ap_set_content_type() should probably know where it is during the + request processing so that it can be used any time. maybe it should be + allowed to set up layers at any time? + + +That's it. :-) + +Helper functions to set up a pipe and a sub-thread would be handy. That +would allow some modules to keep their "read from an fd" approach, rather +than switching to a stateful parser approach. As Dean stated before, +output filtering is necessarily asynchronous: a sub thread or a state +machine thingy is required. + +[ flipping things around, you could say that the initial content can be + generated asynchronously (where the first filter demands the next chunk + of output). this would be incredibly difficult for things like + mod_autoindex. at some point, somebody is pulling content and shoving it + down the BUFF. the above form is "everybody shoves content" ] + +Cheers, +-g + +-- +Greg Stein, http://www.lyra.org/ + Index: ossp-pkg/sio/BRAINSTORM/doc_page_io.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_page_io.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_page_io.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/doc_page_io.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/doc_page_io.txt +++ - 2024-05-20 02:18:02.381565743 +0200 @@ -0,0 +1,166 @@ + +From dgaudet@arctic.org Fri Feb 20 00:36:52 1998 +Date: Fri, 20 Feb 1998 00:35:37 -0800 (PST) +From: Dean Gaudet +To: new-httpd@apache.org +Subject: page-based i/o +X-Comment: Visit http://www.arctic.org/~dgaudet/legal for information regarding copyright and disclaimer. +Reply-To: new-httpd@apache.org + +Ed asked me for more details on what I mean when I talk about "paged based +zero copy i/o". + +While writing mod_mmap_static I was thinking about the primitives that the +core requires of the filesystem. 
What exactly is it that ties us into the +filesystem? and how would we abstract it? The metadata (last modified +time, file length) is actually pretty easy to abstract. It's also easy to +define an "index" function so that MultiViews and such can be implemented. +And with layered I/O we can hide the actual details of how you access +these "virtual" files. + +But therein lies an inefficiency. If we had only bread() for reading +virtual files, then we would enforce at least one copy of the data. +bread() supplies the place that the caller wants to see the data, and so +the bread() code has to copy it. But there's very little reason that +bread() callers have to supply the buffer... bread() itself could supply +the buffer. Call this new interface page_read(). It looks something like +this: + + typedef struct { + const void *data; + size_t data_len; /* amt of data on page which is valid */ + ... other stuff necessary for managing the page pool ... + } a_page_head; + + /* returns NULL if an error or EOF occurs, on EOF errno will be + * set to 0 + */ + a_page_head *page_read(BUFF *fb); + + /* queues entire page for writing, returns 0 on success, -1 on + * error + */ + int page_write(BUFF *fb, a_page_head *); + +It's very important that a_page_head structures point to the data page +rather than be part of the data page. This way we can build a_page_head +structures which refer to parts of mmap()d memory. + +This stuff is a little more tricky to do, but is a big win for performance. +With this integrated into our layered I/O it means that we can have +zero-copy performance while still getting the advantages of layering. + +But note I'm glossing over a bunch of details... like the fact that we +have to decide if a_page_heads are shared data, and hence need reference +counting (i.e. I said "queues for writing" up there, which means some +bit of the a_page_head data has to be kept until it's actually written). +Similarly for the page data.
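The two properties discussed above (a page head that points at data it
does not contain, and the possible need for reference counting) can be
sketched as follows. This is only an illustration under stated
assumptions: the refcount field and the page_from_region(), page_ref()
and page_unref() helpers are hypothetical and are not part of the
proposed page_read()/page_write() interface; a plain in-memory region
stands in for mmap()d memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Page head that points at data it does not own or contain. */
typedef struct a_page_head {
    const void *data;      /* start of the valid bytes */
    size_t      data_len;  /* amt of data which is valid */
    int         refcount;  /* assumption: simple shared ownership */
} a_page_head;

/* Build a head referring to region[off .. off+len) without copying. */
static a_page_head *page_from_region(const void *region, size_t region_len,
                                     size_t off, size_t len)
{
    a_page_head *ph;

    assert(off <= region_len && len <= region_len - off);
    ph = malloc(sizeof(*ph));
    if (ph == NULL)
        return NULL;
    ph->data = (const char *)region + off;
    ph->data_len = len;
    ph->refcount = 1;
    return ph;
}

/* Take another reference, e.g. while the page is queued for writing. */
static void page_ref(a_page_head *ph)
{
    ph->refcount++;
}

/* Drop a reference; returns 1 if the head was freed, 0 otherwise.
 * The data itself is never freed here because the head does not own it. */
static int page_unref(a_page_head *ph)
{
    if (--ph->refcount > 0)
        return 0;
    free(ph);
    return 1;
}
```

Because the head only stores a pointer and a length, several heads can
refer to different parts of the same mapped file, and queuing a page for
writing costs a reference rather than a copy.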
+ +There are other tricks in this area that we can take advantage of -- +like interprocess communication on architectures that do page flipping. +On these boxes if you write() something that's page-aligned and page-sized +to a pipe or unix socket, and the other end read()s into a page-aligned +page-sized buffer then the kernel can get away without copying any data. +It just marks the two pages as shared copy-on-write, and only when +they're written to will the copy be made. So to make this work, your +writer uses a ring of 2+ page-aligned/sized buffers so that it's not +writing on something the reader is still reading. + +Dean + +---- + +For details on HPUX and avoiding extra data copies, see +. + +(note that if you get the postscript version instead, you have to +manually edit it to remove the front page before any version of +ghostscript that I have used will read it) + +---- + +I've been told by an engineer in Sun's TCP/IP group that zero-copy TCP +in Solaris 2.6 occurs when: + + - you've got the right interface card (OC-12 ATM card I think) + - you use write() + - your write buffer is 16k aligned and a multiple of 16k in size + +We currently get the 16k stuff for free by using mmap(). But sun's +current code isn't smart enough to deal with our initial writev() +of the headers and first part of the response. + +---- + +Systems that have a system call to efficiently send the contents of a +descriptor across the network. This is probably the single best way +to do static content on systems that support it. + +HPUX: (10.30 and on) + + ssize_t sendfile(int s, int fd, off_t offset, size_t nbytes, + const struct iovec *hdtrl, int flags); + + (allows you to add headers and trailers in the form of iovec + structs) Marc has a man page; ask if you want a copy. Not included + due to copyright issues. 
man page also available from + http://docs.hp.com/ (in particular, + http://docs.hp.com:80/dynaweb/hpux11/hpuxen1a/rvl3en1a/@Generic__BookTextView/59894;td=3 ) + +Windows NT: + + BOOL TransmitFile( SOCKET hSocket, + HANDLE hFile, + DWORD nNumberOfBytesToWrite, + DWORD nNumberOfBytesPerSend, + LPOVERLAPPED lpOverlapped, + LPTRANSMIT_FILE_BUFFERS lpTransmitBuffers, + DWORD dwFlags + ); + + (does it start from the current position in the handle? I would + hope so, or else it is pretty dumb.) + + lpTransmitBuffers allows for headers and trailers. + + Documentation at: + + http://premium.microsoft.com/msdn/library/sdkdoc/wsapiref_3pwy.htm + http://premium.microsoft.com/msdn/library/conf/html/sa8ff.htm + + Even less related to page based IO: just context switching: + AcceptEx does an accept(), and returns the start of the + input data. see: + + http://premium.microsoft.com/msdn/library/sdkdoc/pdnds/sock2/wsapiref_17jm.htm + + What this means is you require one less syscall to do a + typical request, especially if you have a cache of handles + so you don't have to do an open or close. Hmm. Interesting + question: then, if TransmitFile starts from the current + position, you need a mutex around the seek and the + TransmitFile. If not, you are just limited (eg. byte + ranges) in what you can use it for. + + Also note that TransmitFile can specify TF_REUSE_SOCKET, so that + after use the same socket handle can be passed to AcceptEx. + Obviously only good where we don't have a persistent connection + to worry about. + +---- + +Note that all this is shot to bloody hell by HTTP-NG's multiplexing. +If fragment sizes are big enough, it could still be worthwhile to +do copy avoidance. It also causes performance issues because of +its credit system that limits how much you can write in a single +chunk. + +Don't tell me that if HTTP-NG becomes popular we will see vendors +embedding SMUX (or whatever multiplexing is used) in the kernel to +get around this stuff.
There we go, Apache with a loadable kernel +module. + +---- + +Larry McVoy's document for SGI regarding sendfile/TransmitFile: +ftp://ftp.bitmover.com/pub/splice.ps.gz Index: ossp-pkg/sio/BRAINSTORM/doc_stacked_io.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_stacked_io.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_stacked_io.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/doc_stacked_io.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/doc_stacked_io.txt +++ - 2024-05-20 02:18:02.384157278 +0200 @@ -0,0 +1,1312 @@ +[djg: comments like this are from dean] + +This past summer, Alexei and I wrote a spec for an I/O Filters API... +this proposal addresses one part of that -- 'stacked' I/O with buff.c. + +We have a couple of options for stacked I/O: we can either use existing +code, such as sfio, or we can rewrite buff.c to do it. We've gone over +the first possibility at length, though, and there were problems with each +implementation which was mentioned (licensing and compatibility, +specifically); so far as I know, those remain issues. + +Btw -- sfio will be supported w/in this model... it just wouldn't be the +basis for the model's implementation. + + -- Ed Korthof | Web Server Engineer -- + -- ed@organic.com | Organic Online, Inc -- + -- (415) 278-5676 | Fax: (415) 284-6891 -- + +--------------------------------------------------------------------------- +Stacked I/O With BUFFs + Sections: + + 1.) Overview + 2.) The API + User-supplied structures + API functions + 3.) Detailed Description + The bfilter structure + The bbottomfilter structure + The BUFF structure + Public functions in buff.c + 4.) Efficiency Considerations + Buffering + Memory copies + Function chaining + writev + 5.)
Code in buff.c + Default Functions + Heuristics for writev + Writing + Reading + Flushing data + Closing stacks and filters + Flags and Options + +************************************************************************* + Overview + +The intention of this API is to make Apache's BUFF structure modular +while retaining high efficiency. Basically, it involves rewriting +buff.c to provide 'stacked' I/O -- where the data passes through a +series of 'filters', which may modify it. + +There are two parts to this, the core code for BUFF structures, and the +"filters" used to implement new behavior. "filter" is used to refer to +both the sets of 6 functions, as shown in the bfilter structure in the +next section, and to BUFFs which are created using a specific bfilter. +These will also be occasionally referred to as "user-supplied", though +the Apache core will need to use these as well for basic functions. + +The user-supplied functions should use only the public BUFF API, rather +than any internal details or functions. One thing which may not be +clear is that in the core BUFF functions, the BUFF pointer passed in +refers to the BUFF on which the operation will happen. OTOH, in the +user-supplied code, the BUFF passed in is the next buffer down the +chain, not the current one. + +************************************************************************* + The API + + User-supplied structures + +First, the bfilter structure is used in all filters: + typedef struct bfilter { + int (*writev)(BUFF *, void *, struct iovec *, int); + int (*read)(BUFF *, void *, char *, int); + int (*write)(BUFF *, void *, const char *, int); + int (*flush)(BUFF *, void *, const char *, int, struct bfilter *); + int (*transmitfile)(BUFF *, void *, file_info_ptr *); + void (*close)(BUFF *, void *); + } bfilter; + +bfilters are placed into a BUFF structure along with a +user-supplied void * pointer.
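The calling convention described in the Overview (core functions operate
on the BUFF they are given; user-supplied functions receive the next BUFF
down the chain) can be sketched with a toy stack. Everything here is a
hypothetical stand-in, not the proposed buff.c code: toy_buff, toy_bfilter
and toy_bwrite() are invented names, only the 'write' slot is modelled,
and a fixed memory sink plays the role of the bottom of the stack.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef struct toy_buff toy_buff;

/* Toy filter with only the 'write' slot of the proposed bfilter. */
typedef struct {
    int (*write)(toy_buff *next, void *ctx, const char *buf, int len);
} toy_bfilter;

struct toy_buff {
    const toy_bfilter *filter;
    void *ctx;        /* the user-supplied void * pointer */
    toy_buff *next;   /* next buffer down the chain, NULL at the bottom */
};

/* Core entry point: fb is the BUFF the operation happens on.  The filter
 * function is handed fb->next, not fb itself. */
static int toy_bwrite(toy_buff *fb, const char *buf, int len)
{
    assert(fb != NULL && fb->filter != NULL);
    return fb->filter->write(fb->next, fb->ctx, buf, len);
}

/* Bottom of the stack: append to a fixed memory sink. */
typedef struct { char data[256]; int used; } mem_sink;

static int sink_write(toy_buff *next, void *ctx, const char *buf, int len)
{
    mem_sink *s = ctx;
    int room = (int)sizeof(s->data) - s->used;

    (void)next;               /* nothing below the bottom filter */
    if (len > room)
        len = room;
    memcpy(s->data + s->used, buf, (size_t)len);
    s->used += len;
    return len;
}

/* User-supplied filter: upper-case the data, then pass it down using
 * only the public entry point on the next BUFF. */
static int upper_write(toy_buff *next, void *ctx, const char *buf, int len)
{
    char tmp[64];
    int done = 0;

    (void)ctx;
    while (done < len) {
        int n = len - done, i, w;

        if (n > (int)sizeof(tmp))
            n = (int)sizeof(tmp);
        for (i = 0; i < n; i++)
            tmp[i] = (char)toupper((unsigned char)buf[done + i]);
        w = toy_bwrite(next, tmp, n);
        if (w < 0)
            return -1;
        done += w;
        if (w < n)
            break;            /* lower layer took only part of it */
    }
    return done;              /* input characters processed */
}
```

Pushing the upper-casing filter on top of the sink and calling
toy_bwrite() on the top BUFF sends the data through upper_write(), which
only ever sees the BUFF below it.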
+ +Second, the following structure is for use with a filter which can +sit at the bottom of the stack: + + typedef struct { + void *(*bgetfileinfo)(BUFF *, void *); + void (*bpushfileinfo)(BUFF *, void *, void *); + } bbottomfilter; + + + BUFF API functions + +The following functions are new BUFF API functions: + +For filters: + +BUFF * bcreatestack(pool *p, int flags, struct bfilter *, + struct bbottomfilter *, void *); +BUFF * bpushfilter (BUFF *, struct bfilter *, void *); +BUFF * bpushbuffer (BUFF *, BUFF *); +BUFF * bpopfilter(BUFF *); +BUFF * bpopbuffer(BUFF *); +void bclosestack(BUFF *); + +For BUFFs in general: + +int btransmitfile(BUFF *, file_info_ptr *); +int bsetstackopts(BUFF *, int, const void *); +int bsetstackflags(BUFF *, int, int); + +Note that a new flag is needed for bsetstackflags: +B_MAXBUFFERING + +The current bcreate should become + +BUFF * bcreatebuffer (pool *p, int flags, struct bfilter *, void *); + +************************************************************************* + Detailed Explanation + + bfilter structure + +The void * pointer used in all these functions, as well as those in the +bbottomfilter structure and the filter API functions, is always the same +pointer w/in an individual BUFF. + +The first function in a bfilter structure is 'writev'; this is only +needed for high efficiency writing, generally at the level of the system +interface. In it's absence, multiple writes will be done w/ 'write'. +Note that defining 'writev' means you must define 'write'. + +The second is 'write'; this is the generic writing function, taking a BUFF +* to which to write, a block of text, and the length of that block of +text. The expected return is the number of characters (out of that block +of text) which were successfully processed (rather than the number of +characters actually written). 
+ +The third is 'read'; this is the generic reading function, taking a BUFF * +from which to read data, and a void * buffer in which to put text, and the +number of characters to put in that buffer. The expected return is the +number of characters placed in the buffer. + +The fourth is 'flush'; this is intended to force the buffer to spit out +any data it may have been saving, as well as to clear any data the +BUFF code was storing. If the third argument is non-null, then it +contains more text to be printed; that text need not be null terminated, +but the fourth argument contains the length of text to be processed. The +expected return value should be the number of characters handled out +from the third argument (0 if there are none), or -1 on error. Finally, +the fifth argument is a pointer to the bfilter struct containing this +function, so that it may use the write or writev functions in it. Note +that general buffering is handled by BUFF's internal code, and module +writers should not store data for performance reasons. + +The fifth is 'transmitfile', which takes as its arguments a buffer to +which to write (if non-null), the void * pointer containing configuration +(or other) information for this filter, and a system-dependent pointer +(the file_info_ptr structure will be defined on a per-system basis) +containing information required to print the 'file' in question. +This is intended to allow zero-copy TCP in Win32. + +The sixth is 'close'; this is what is called when the connection is being +closed. The 'close' should not be passed on to the next filter in the +stack. Most filters will not need to use this, but if database handles +or some other object is created, this is the point at which to remove it. +Note that flush is called automatically before this. + + bbottomfilter Structure + +The first function, bgetfileinfo, is designed to allow Apache to get +information from a BUFF struct regarding the input and output sources. 
+This is currently used to get the input file number to select on a +socket to see if there's data waiting to be read. The information +returned is platform specific; the void * pointer passed in holds +the void * pointer passed to all user-supplied functions. + +The second function, bpushfileinfo, is used to push file information +onto a buffer, so that the buffer can be fully constructed and ready +to handle data as soon as possible after a client has connected. +The first void * pointer holds platform specific information (in +Unix, it would be a pair of file descriptors); the second holds the +void * pointer passed to all user-supplied functions. + +[djg: I don't think I really agree with the distinction here between +the bottom and the other filters. Take the select() example, it's +valid for any layer to define a fd that can be used for select... +in fact it's the topmost layer that should really get to make this +definition. Or maybe I just have your top and bottom flipped. In +any event I think this should be part of the filter structure and +not separate.] + + The BUFF structure + +A couple of changes are needed for this structure: remove fd and +fd_in; add a bfilter structure; add a pointer to a bbottomfilter; +add three pointers to the next BUFFs: one for the next BUFF in the +stack, one for the next BUFF which implements write, and one +for the next BUFF which implements read. + + + Public functions in buff.c + +BUFF * bpushfilter (BUFF *, struct bfilter *, void *); + +This function adds the filter functions from bfilter, stacking them on +top of the BUFF. It returns the new top BUFF, or NULL on error. + +BUFF * bpushbuffer (BUFF *, BUFF *); + +This function places the second buffer on the top of the stack that +the first one is on. It returns the new top BUFF, or NULL on error. 
+ +BUFF * bpopfilter(BUFF *); +BUFF * bpopbuffer(BUFF *); + +Unattaches the top-most filter from the stack, and returns the new +top-level BUFF, or NULL on error or when there are no BUFFs +remaining. The two are synonymous. + +void bclosestack(BUFF *); + +Closes the I/O stack, removing all the filters in it. + +BUFF * bcreatestack(pool *p, int flags, struct bfilter *, + struct bbottomfilter *, void *); + +This creates an I/O stack. It returns NULL on error. + +BUFF * bcreatebuffer(pool *p, int flags, struct bfilter *, void *); + +This creates a BUFF for later use with bpushbuffer. The BUFF is +not set up to be used as an I/O stack, however. It returns NULL +on error. + +int bsetstackopts(BUFF *, int, const void *); +int bsetstackflags(BUFF *, int, int); + +These functions, respectively, set options on all the BUFFs in a +stack. The new flag, B_MAXBUFFERING is used to disable a feature +described in the next section, whereby only the first and last +BUFFs will buffer data. + +************************************************************************* + Efficiency Considerations + + Buffering + +All input and output is buffered by the standard buffering code. +People writing code to use buff.c should not concern themselves with +buffering for efficiency, and should not buffer except when necessary. + +The write function will typically be called with large blocks of text; +the read function will attempt to place the specified number of bytes +into the buffer. + +Dean noted that there are possible problems w/ multiple buffers; +further, some applications must not be buffered. This can be +partially dealt with by turning off buffering, or by flushing the +data when appropriate. + +However, some potential problems arise anyway. The simplest example +involves shrinking transformations; suppose that you have a set +of filters, A, B, and C, such that A outputs less text than it +recieves, as does B (say A strips comments, and B gzips the result). 
+Then after a write to A which fills the buffer, A writes to B. +However, A won't write enough to fill B's buffer, so a memory copy +will be needed. This continues till B's buffer fills up, then +B will write to C's buffer -- with the same effect. + +[djg: I don't think this is the issue I was really worried about -- +in the case of shrinking transformations you are already doing +non-trivial amounts of CPU activity with the data, and there's +no copying of data that you can eliminate anyway. I do recognize +that there are non-CPU intensive filters -- such as DMA-capable +hardware crypto cards. I don't think they're hard to support in +a zero-copy manner though.] + +The maximum additional number of bytes which will be copied in this +scenario is on the order of nk, where n is the total number of bytes, +and k is the number of filters doing shrinking transformations. + +There are several possible solutions to this issue. The first +is to turn off buffering in all but the first filter and the +last filter. This reduces the number of unnecessary byte copies +to at most one per byte, however it means that the functions in +the stack will get called more frequently; but it is the default +behavior, overridable by setting the B_MAXBUFFERING with +bsetstackflags. Most filters won't involve a net shrinking +transformation, so even this will rarely be an issue; however, +if the filters do involve a net shrinking transformation, for +the sake of network-efficiency (sending reasonably sized blocks), +it may be more efficient anyway. + +A second solution is more general use of writev for communication +between different buffers. This complicates the programing work, +however. + + + Memory copies + +Each write function is passed a pointer to constant text; if any changes +are being made to the text, it must be copied. However, if no changes +are made to the text (or to some smaller part of it), then it may be +sent to the next filter without any additional copying. 
This should
+provide the minimal necessary memory copies.
+
+[djg: Unfortunately this makes it hard to support page-flipping and
+async i/o because you don't have any reference counts on the data.
+But I already go into a little detail on that in docs/page_io.]
+
+    Function chaining
+
+In order to avoid unnecessary function chaining for reads and writes,
+when a filter is pushed onto the stack, the buff.c code will determine
+which is the next BUFF which contains a read or write function, and
+reads and writes, respectively, will go directly to that BUFF.
+
+    writev
+
+writev is a function for efficient writing to the system; in terms of
+this API, however, it also works for dealing with multiple blocks of
+text without doing unnecessary byte copies.  It is not required.
+
+Currently, the system level writev is used in two contexts: for
+chunking and when a block of text is written which, combined with
+the text already in the buffer, would make the buffer overflow.
+
+writev would be implemented both by the default bottom level filter
+and by the chunking filter for these operations.  In addition, writev
+may be used, as noted above, to pass multiple blocks of text w/o
+copying them into a single buffer.  Note that if the next filter does
+not implement writev, however, this will be equivalent to repeated
+calls to write, which may or may not be more efficient.  Up to
+IOV_MAX-2 blocks of text may be passed along in this manner.  Unlike
+the system writev call, the writev in this API should be called only
+once, with an array of iovecs and a count of the number of iovecs
+in it.
+
+If a bfilter defines writev, writev will be called whether or not
+NO_WRITEV is set; hence, it should deal with that case in a reasonable
+manner.
+
+[djg: We can't guarantee atomicity of writev() when we emulate it.
+Probably not a problem, just an observation.]
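The fallback described above, where a writev request is turned into repeated calls to write, can be sketched as follows. This is only an illustration under stated assumptions: write_fn, mem_sink, mem_write, and emulate_writev are hypothetical names invented here, not part of buff.c. It also shows why the emulation is not atomic: each block goes out in its own call, so a failure or short write can leave the output partway through the vector.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

/* Hypothetical write callback: returns bytes accepted, or -1 on error. */
typedef int (*write_fn)(void *ctx, const char *buf, int nbyte);

/* Hypothetical in-memory sink used to exercise the emulation. */
typedef struct {
    char data[64];
    int  len;
} mem_sink;

static int mem_write(void *ctx, const char *buf, int nbyte)
{
    mem_sink *s = ctx;
    int room = (int)sizeof(s->data) - s->len;
    if (nbyte > room)
        nbyte = room;            /* short write when the sink is nearly full */
    memcpy(s->data + s->len, buf, (size_t)nbyte);
    s->len += nbyte;
    return nbyte;
}

/* Emulate writev with one or more write calls per block.
 * Returns the total bytes written, or -1 if a write fails. */
static int emulate_writev(write_fn wr, void *ctx,
                          const struct iovec *vec, int nvecs)
{
    int total = 0, i;

    for (i = 0; i < nvecs; i++) {
        const char *p = vec[i].iov_base;
        int left = (int)vec[i].iov_len;
        while (left > 0) {
            int rv = wr(ctx, p, left);
            if (rv < 0)
                return -1;       /* earlier blocks are already out: not atomic */
            if (rv == 0)
                return total;    /* no progress; give up with a partial total */
            p += rv;
            left -= rv;
            total += rv;
        }
    }
    return total;
}
```

A chunking filter could hand over a chunk header, the payload, and the trailing CRLF as three iovecs without first copying them into one buffer.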
+
+*************************************************************************
+        Code in buff.c
+
+    Default Functions
+
+The default actions are generally those currently performed by Apache,
+save that they'll only attempt to write to a buffer, and they'll
+return an error if there are no more buffers.  That is, you must implement
+read, write, and flush in the bottom-most filter.
+
+Except for close(), the default code will simply pass the function call
+on to the next filter in the stack.  Some samples follow.
+
+    Heuristics for writev
+
+Currently, we call writev for chunking, and when we get enough data that
+the total overflows the buffer.  Since chunking is going to become a
+filter, the chunking filter will use writev; in addition, bwrite will
+trigger bwritev as shown (note that system specific information should
+be kept at the filter level):
+
+in bwrite:
+
+    if (fb->outcnt > 0 && nbyte + fb->outcnt >= fb->bufsiz) {
+        /* build iovec structs: buffered data first, then the new block */
+        struct iovec vec[2];
+        vec[0].iov_base = (void *) fb->outbase;
+        vec[0].iov_len = fb->outcnt;
+        fb->outcnt = 0;
+        vec[1].iov_base = (void *) buf;
+        vec[1].iov_len = nbyte;
+        return bwritev (fb, vec, 2);
+    } else if (nbyte >= fb->bufsiz) {
+        return write_with_errors(fb, buf, nbyte);
+    }
+
+Note that the code above takes the place of large_write (as well
+as taking code from it).
+
+So, bwritev would look something like this (copying and pasting freely
+from the current source for writev_it_all, which could be replaced):
+
+-----
+int bwritev (BUFF * fb, struct iovec * vec, int nvecs) {
+    if (!fb)
+        return -1; /* the bottom level filter implemented neither write nor
+                    * writev. */
+    if (fb->bfilter.writev) {
+        return fb->bfilter.writev(fb->next, fb->bfilter_info, vec, nvecs);
+    } else if (fb->bfilter.write) {
+        /* while it's nice and easy to build the vector and crud, it's painful
+         * to deal with partial writes (esp. w/ the vector)
+         */
+        int i = 0, rv;
+        while (i < nvecs) {
+            do {
+                rv = fb->bfilter.write(fb->next, fb->bfilter_info,
+                                       vec[i].iov_base, vec[i].iov_len);
+            } while (rv == -1 && (errno == EINTR || errno == EAGAIN)
+                     && !(fb->flags & B_EOUT));
+            if (rv == -1) {
+                if (errno != EINTR && errno != EAGAIN) {
+                    doerror (fb, B_WR);
+                }
+                return -1;
+            }
+            fb->bytes_sent += rv;
+            /* recalculate vec to deal with partial writes */
+            while (rv > 0) {
+                if (rv < vec[i].iov_len) {
+                    vec[i].iov_base = (char *)vec[i].iov_base + rv;
+                    vec[i].iov_len -= rv;
+                    rv = 0;
+                    if (vec[i].iov_len == 0) {
+                        ++i;
+                    }
+                } else {
+                    rv -= vec[i].iov_len;
+                    ++i;
+                }
+            }
+            if (fb->flags & B_EOUT)
+                return -1;
+        }
+        /* if we got here, we wrote it all */
+        return 0;
+    } else {
+        /* this filter defines neither; hand the vector to the next BUFF */
+        return bwritev(fb->next, vec, nvecs);
+    }
+}
+-----
+The default filter's writev function will be pretty much like
+writev_it_all.
+
+
+    Writing
+
+The general case for writing data is significantly simpler with this
+model.  Because special cases are not dealt with in the BUFF core,
+a single internal interface to writing data is possible; I'm going
+to assume it's reasonable to standardize on write_with_errors, but
+some other function may be more appropriate.
+
+In the revised bwrite (which I'll omit for brevity), the following
+must be done:
+    check for error conditions
+    check to see if any buffering is done; if not, send the data
+        directly to the write_with_errors function
+    check to see if we should use writev or write_with_errors
+        as above
+    copy the data to the buffer (we know it fits since we didn't
+        need writev or write_with_errors)
+
+The other work the current bwrite is doing is
+    ifdef'ing around NO_WRITEV
+    numerous decisions regarding whether or not to send chunks
+
+Generally, buff.c has a number of functions whose entire purpose is
+to handle particular special cases wrt chunking, all of which could
+be simplified with a chunking filter.
+
+write_with_errors would not need to change; buff_write would.
Here is a new version of it:
+
+-----
+/* the lowest level writing primitive */
+static ap_inline int buff_write(BUFF *fb, const void *buf, int nbyte)
+{
+    if (fb->bfilter.write)
+        return fb->bfilter.write(fb->next_writer, fb->bfilter_info, buf, nbyte);
+    else
+        return bwrite(fb->next_writer, buf, nbyte);
+}
+-----
+
+If the btransmitfile function is called on a buffer which doesn't implement
+it, the system will attempt to read data from the file identified
+by the file_info_ptr structure and use other methods to write it out.
+
+    Reading
+
+One of the basic reading functions in Apache 1.3b3 is buff_read;
+here is how it would look within this spec:
+
+-----
+/* the lowest level reading primitive */
+static ap_inline int buff_read(BUFF *fb, void *buf, int nbyte)
+{
+    if (!fb)
+        return -1; /* the bottom level filter is not set up properly */
+
+    if (fb->bfilter.read)
+        return fb->bfilter.read(fb->next_reader, fb->bfilter_info, buf, nbyte);
+    else
+        return bread(fb->next_reader, buf, nbyte);
+}
+-----
+The code currently in buff_read would become part of the default
+filter.
+
+
+    Flushing data
+
+flush will get passed on down the stack automatically, with recursive
+calls to bflush.  The user-supplied flush function will be called then,
+and also before close is called.  The user-supplied flush should not
+call flush on the next buffer.
+
+[djg: Poorly written "expanding" filters can cause some nastiness
+here.  In order to flush a layer you have to write out your current
+buffer, and that may cause the layer below to overflow a buffer and
+flush it.  If the filter is expanding then it may have to add more to
+the buffer before flushing it to the layer below.  It's possible that
+the layer below will end up having to flush twice.  It's a case where
+writev-like capabilities are useful.]
+
+    Closing Stacks and Filters
+
+When a filter is removed from the stack, flush will be called, then close
+will be called.
When the entire stack is being closed, this operation +will be done automatically on each filter within the stack; generally, +filters should not operate on other filters further down the stack, +except to pass data along when flush is called. + + Flags and Options + +Changes to flags and options using the current functions only affect +one buffer. To affect all the buffers on down the chain, use +bsetstackopts or bsetstackflags. + +bgetopt is currently only used to grab a count of the bytes sent; +it will continue to provide that functionality. bgetflags is +used to provide information on whether or not the connection is +still open; it'll continue to provide that functionality as well. + +The core BUFF operations will remain, though some operations which +are done via flags and options will be done by attaching appropriate +filters instead (eg. chunking). + +[djg: I'd like to consider filesystem metadata as well -- we only need +a few bits of metadata to do HTTP: file size and last modified. We +need an etag generation function, it is specific to the filters in +use. You see, I'm envisioning a bottom layer which pulls data out of +a database rather than reading from a file.] + + +************************************************************** +************************************************************** +Date: Wed, 9 Sep 1998 18:55:40 -0700 (PDT) +From: Alexei Kosut +To: new-httpd@apache.org +Subject: A Magic Cache example +Message-ID: + +During the drive home, I came up with a good example of how I envision the +new module/cache/layer model thingy working. Comments please: + +The middle end of the server is responsible for taking the request the +front end gives it and somehow telling the back end how to fulfill it. I +look at it like this: The request is a URI (Uniform Resource Identifier) +and a set of request dimensions (the request headers, the remote IP +address, the time of day, etc...). 
The middle end, via its configuration,
+translates this into a request for content from a backing store module,
+plus possibly some filter modules. Since the term "filename" is too
+flat-file specific, let's call the parameter we pass to the backing store
+an SRI (Specific Resource Identifier), in a format specific to that module.
+
+Our example is similar to the one I was using earlier, with some
+additions: The request is for a URI, say "/skzb/teckla.html". The response
+is a lookup from a (slow) database. The URI maps to the mod_database SRI
+of "BOOK:0-441-7997-9" (I made that format up). We want to take that
+output and convert it from whatever charset it's in into Unicode. We then
+have a PHP script that works on a Unicode document and does things based
+on whether the browser is Netscape or not. Then we translate the document
+to the best charset that matches the characters used and the client's
+capabilities and send it.
+
+So upon request for /skzb/teckla.html, the middle end translates the
+request into the following "equation":
+
+        SRI: mod_database("BOOK:0-441-7997-9")
+    +   filter: mod_charset("Unicode")
+    +   filter: mod_php()
+    +   filter: mod_charset("best_fit")
+    -------------------------------------------------
+        URI: /skzb/teckla.html
+
+It then constructs a stack of IO (NSPR) filters like this:
+
+mod_database -> cache-write -> mod_charset -> cache-write -> mod_php ->
+cache-write -> mod_charset -> cache-write -> client
+
+And sets it to running. Each of the cache filters is a write-through
+filter that copies its data into the cache with a tag based on what
+equation the middle end uses to get to it, plus the request dimensions it
+uses (info it gets from the modules).
+
+The database access is stored under "SRI: mod_database(BOOK:0-441-7997-9)"
+with no dimensions (because it's the same for all requests). The first
+charset manipulation is stored under "SRI: mod_database(BOOK...) + filter:
+mod_charset(Unicode)", again with no dimensions.
The PHP output is stored +under "SRI: mod_database(BOOK...) + filter: mod_charset(Unicode) + filter: +mod_php()" with dimesions of (User-Agent). The final output is stored both +as "SRI: mod_database(BOOK...) + filter: mod_charset(Unicode) + filter: +mod_php() + filter: mod_charset(best_fit)" and "URI: /skzb/teckla.html" +(they're the same thing), both with dimensions of (User-Agent, +Accept-Charset). + +So far so good. Now, when another request for /skzb/teckla.html comes in, +the cache is consulted to see how much we can use. First, the URI is +looked up. This can be done by a kernel or other streamlined part of the +server. So "URI: /skzb/teckla.html" is looked up, and one entry pops out +with dimensions of (User-Agent, Accept-Charset). The user-agent and +accept-charset of the request are compared against the ones of the stored +entiry(ies). If one matches, it can be sent directly. + +If not, the server proceeds to look up "SRI: mod_database(BOOK...) + +filter: mod_charset(Unicode) + filter: mod_php()". If the request has a +different accept-charset, but the same user-agent, then this can be +reprocessed by mod_charset and used. Otherwise, the server proceeds back +to "SRI: mod_database(BOOK...) + filter: mod_charset(Unicode)", which will +match any request. There's probably some sort of cache invalidation +(expires, etc...) that happens eventually to result in a new database +lookup, but mostly, that very costly operation is avoided. + +I think I've made it out to be a bit more complicated than it is, with the +long equation strings mixed in there. But the above reflects my +understanding of how the new Apache 2.0 system should work. + +Note 1: The cache is smarter than I make it out here when it comes to +adding new entries. 
It should realize that, since the translation to +Unicode doesn't change or restrict the dimensions of the request, it +really is pointless to cache the original database lookup, since it will +always be translated in exactly the same manner. Knowing this, it will +only cache the Unicode version. + +Note 2: PHP probably doesn't work with Unicode. And there may not be a way +to identify a script as only acting on the User-Agent dimension. That's +not the point. + +Note 3: Ten bonus points to anyone who's read this far, and is the first +person to answer today's trivia question: What does the skzb referred to +in the example URI stand for? There's enough information in this mail to +figure it out (with some help from the Net), even if you don't know +offhand (though if you do, I'd be happier). + +-- Alexei Kosut + Stanford University, Class of 2001 * Apache * + + +************************************************************** +Message-ID: <19980922224326.A16219@aisa.fi.muni.cz> +Date: Tue, 22 Sep 1998 22:43:26 +0200 +From: Honza Pazdziora +To: new-httpd@apache.org +Subject: Re: I/O Layering in next version of Apache. +References: <19980922111627.19784.qmail@hyperreal.org> <3607D53A.1FF6D93@algroup.co.uk> <13831.55021.929560.977122@zap.ml.org> +In-Reply-To: <13831.55021.929560.977122@zap.ml.org>; from Ben Hyde on Tue, Sep 22, 1998 at 01:04:12PM -0400 + +> >Does anyone have a starting point for layered I/O? I know we kicked it + +Hello, + +there has been a thread on modperl mailing list recently about +problems we have with the current architecture. Some of the points +were: what requerements will be put on modules to be new I/O +compliant. I believe it's the Apache::SSI vs. Apache::SSIChain +difference between 1.3.* and 2.*. The first fetches the file _and_ +does the SSI, the second takes input from a different module that +either gets the HTML or runs the CGI or so, and processes its output. +Should all modules be capable of working on some other module's +output? 
Probably except those that actually go to disk or database for +the primary data. + +Randal's point was that output of any module could be processed, so +that no module should make any assumption whether it's sending data +directly to the browser or to some other module. This can be used both +for caching, but it also one of the things to get the filtering +transparent. + +Also, as Apache::GzipChain module shows, once you process the output, +you may need to modify the headers as well. I was hit by this when I +tried to convert between charsets, to send out those that the browsers +would understand. The Apache::Mason module shows that you can build +a page from pieces. Each of the pieces might have different +characteristics (charset, for example), so with each piece of code we +might need to have its own headers that describe it, or at least the +difference between the final (global) header-outs and its local. + +Sorry for bringing so much Perl module names in, but modperl is +currently a way to get some layered I/O done in 1.3.*, so I only have +practical experiance with it. + +Yours, + +------------------------------------------------------------------------ + Honza Pazdziora | adelton@fi.muni.cz | http://www.fi.muni.cz/~adelton/ + I can take or leave it if I please +------------------------------------------------------------------------ + +************************************************************** +Date: Wed, 23 Sep 1998 10:46:47 -0700 (PDT) +From: Dean Gaudet +To: new-httpd@apache.org +Subject: Re: I/O Layering in next version of Apache. +In-Reply-To: <36092F2D.BCC4E5C1@algroup.co.uk> +Message-ID: + +On Wed, 23 Sep 1998, Ben Laurie wrote: + +> Dean Gaudet wrote: +> > +> > On Wed, 23 Sep 1998, Ben Laurie wrote: +> > +> > > Is the simplest model that accomodates this actually just a stack +> > > (tree?) of webservers? Naturally, we wouldn't talk HTTP between the +> > > layers, but pass (header,content) pairs around (effectively). +> > > Interesting. 
+> > +> > We could just talk "compiled" HTTP -- using a parsed representation of +> > everything essentially. +> +> That's pretty much what I had in mind - but does it make sense? I have +> to admit, it makes a certain amount of sense to me, but I still have +> this nagging suspicion that there's a catch. + +We talked about this during the developers meeting earlier this summer... +while we were hiking, so I don't think there were any notes. + +I think it'd be a useful exercise to specify a few example applications we +want to be able to support, and then consider methods of implementing +those applications. Make the set as diverse and small as possible. I'll +take the easiest one :) + +- serve static content from arbitrary backing store (e.g. file, database) + +Once we flesh such a list out it may be easier to consider implementation +variations... + +I think it was Cliff who said it this way: in a multiple layer setup he +wants to be able to partition the layers across servers in an arbtrary +manner. For example, a proxy cache on one box which the world talks to, +and which backends to various other boxes for dynamic and static content. +Or maybe the static content is on the same server as the proxy. If this is +something we want to support then talking (a restricted form of) HTTP +between layers is interesting. + +Now we can all start worrying about performance ;) + +Dean + + +************************************************************** +Date: Wed, 23 Sep 1998 11:23:30 -0700 (PDT) +From: Alexei Kosut +To: new-httpd@apache.org +Subject: Re: I/O Layering in next version of Apache. +In-Reply-To: <36092F2D.BCC4E5C1@algroup.co.uk> +Message-ID: + +On Wed, 23 Sep 1998, Ben Laurie wrote: + +> > We could just talk "compiled" HTTP -- using a parsed representation of +> > everything essentially. +> +> That's pretty much what I had in mind - but does it make sense? 
I have +> to admit, it makes a certain amount of sense to me, but I still have +> this nagging suspicion that there's a catch. + +One important thing to note is that we want this server to be able to +handle non-HTTP requests. So using HTTP as the internal language (as we do +now) is not the way to go. What we talked about in SF was using a basic +set of key/value pairs to represent the metadata of the response. Which +would of course bear an uncanny resemblance to HTTP-style MIME headers... + +Certainly, and this is the point I think the originator of this thread +raised, each module layer (see the emails I sent a few weeks ago for more +details on how I see *that*) needs to provide both a content filter and a +metadata filter. Certainly a module that does encoding has to be able to +alter the headers to add a Content-Encoding, Transfer-Encoding, TE, or +what have you. Many module that does anything to the content will +want to add headers, and many others will need to alter the dimensions on +which the request is served, or what the parameters to those dimensions +are for the current request. The latter is absolutely vital for cacheing. + +The problem, as I see it, is this: Often, I suspect it will be the case +that the module does not know what metadata it will be altering (and how) +until after it has processed the request. i.e., a PHP script may not +discover what dimensions it uses (as we discussed earlier) until after it +has parsed the entire script. But if the module is functioning as an +in-place filter, that can cause massive headaches if we need the metadata +in a complete form *before* we sent the entity, as we do for HTTP. + +I'm not quite sure how to solve that problem. Anyone have any brilliant +ideas? 
+ +(Note that for internal caching, we don't actually need the dimension data +until after the request, because we can alter the state of the cache at +any time, but if we want to place nice with HTTP and send Vary: headers +and such, we do need that information. I guess we could send Vary: +footers...) + +-- Alexei Kosut + Stanford University, Class of 2001 * Apache * + + +************************************************************** +Date: 23 Sep 1998 20:26:58 -0000 +Message-ID: <19980923202658.25736.qmail@zap.ml.org> +From: Ben Hyde +To: new-httpd@apache.org +Subject: Stacking up Response Handling +In-Reply-To: +References: <36092F2D.BCC4E5C1@algroup.co.uk> + + +Alexei Kosut writes: +>The problem, as I see it, is this: Often, I suspect it will be the case +>that the module does not know what metadata it will be altering (and how) +>until after it has processed the request. i.e., a PHP script may not +>discover what dimensions it uses (as we discussed earlier) until after it +>has parsed the entire script. But if the module is functioning as an +>in-place filter, that can cause massive headaches if we need the metadata +>in a complete form *before* we sent the entity, as we do for HTTP. +> +>I'm not quite sure how to solve that problem. Anyone have any brilliant +>ideas? + +This is the same as building a layout engine that incremental layout +but simpler since I doubt we'd want to allow for reflow. + +Sometimes you can send output right along, sometimes you have to wait. +I visualize the output as a tree/outline and as it is swept out a +stack holds the path to the leave. Handlers for the individual nodes +wait or proceed depending on if they can. + +It's pretty design with the pipeline consisting of this stack of +output transformers/generators. Each pipeline stage accepts a stream +of output_chunks. I think of these output_chunks as coming in plenty +of flavors, for example transmit_file, transmit_memory, etc. 
Some +pipeline stages might handle very symbolic chunks. For example +transmit_xml_tree might be handed to a transform_xml_to_html stage in +the pipeline. + +I'm assuming the core server would have only a few kinds of pipeline +nodes: generate_response, generate_content_from_url_via_file_system, +generate_via_classic_module_api. Things like convert_char_set or +do_cool_transfer_encoding could easily be loaded at runtime and +authored outside the core. That would be nice. + +For typical fast responses we wouldn't push much on this stack at +all. It might go something like this: Push generate_response node, +it selects an appropriate content generator by consulting the +module community and pushes that. Often this is +generate_content_from_url_via_file_system which in turn does +all that ugly mapping to a file name and then passes +transmit_file down the pipeline and pops itself off the stack. +generate_response once back on top again does the transmit and +pops off. + +For rich complex output generation we might push all kinds of things +(charset converters, transfer encoders, XML -> HTML rewriters, cache +builders, old style apache module API simulators, whatever). + +The intra-stack element protocol gets interesting around issues +like error handling, blocking, etc. + +I particularly like how this allows simulation of the old module API, +as well as the API of other servers, and experimenting with other +module APIs which cross process or machine boundaries. + +In many ways this isn't that much different from what was proposed +a year ago. 
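The stack of pipeline stages described above can be sketched in C. This is only an illustration under stated assumptions, not code from Apache: every name here (`output_chunk`, `stage`, `pipeline_push`, `pipeline_pop`, `pipeline_send`, `sink_accept`, `upcase_accept`) is hypothetical, only memory chunks are actually processed, and the error handling and blocking questions mentioned above are left out entirely.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Two of the chunk "flavors" mentioned in the text. */
typedef enum { CHUNK_MEMORY, CHUNK_FILE } chunk_kind;

typedef struct {
    chunk_kind kind;
    const char *data;   /* CHUNK_MEMORY: bytes; CHUNK_FILE: a file name */
    size_t len;
} output_chunk;

typedef struct stage stage;
struct stage {
    void (*accept)(stage *s, const output_chunk *c); /* handle one chunk */
    stage *next;        /* stage below this one, NULL at the bottom */
    void *state;        /* per-stage private data */
};

#define MAX_STAGES 8
typedef struct { stage *stages[MAX_STAGES]; size_t depth; } pipeline;

/* Push a stage on top of the stack; it feeds the previous top. */
static int pipeline_push(pipeline *p, stage *s)
{
    if (p->depth == MAX_STAGES)
        return -1;
    s->next = p->depth ? p->stages[p->depth - 1] : NULL;
    p->stages[p->depth++] = s;
    return 0;
}

/* Pop and return the top stage, or NULL if the stack is empty. */
static stage *pipeline_pop(pipeline *p)
{
    return p->depth ? p->stages[--p->depth] : NULL;
}

/* Hand a chunk to whatever stage is currently on top. */
static void pipeline_send(pipeline *p, const output_chunk *c)
{
    assert(p->depth > 0);
    stage *top = p->stages[p->depth - 1];
    top->accept(top, c);
}

/* Bottom stage: collects memory chunks in a fixed buffer (a stand-in
 * for writing to the socket; file chunks are ignored in this sketch). */
typedef struct { char buf[256]; size_t len; } sink_state;

static void sink_accept(stage *s, const output_chunk *c)
{
    sink_state *out = s->state;
    if (c->kind != CHUNK_MEMORY)
        return;
    size_t room = sizeof out->buf - out->len;
    size_t n = c->len < room ? c->len : room;
    memcpy(out->buf + out->len, c->data, n);
    out->len += n;
}

/* Transformer stage: upper-cases memory chunks in bounded pieces and
 * passes every other chunk kind through untouched. */
static void upcase_accept(stage *s, const output_chunk *c)
{
    assert(s->next != NULL);
    if (c->kind != CHUNK_MEMORY) {
        s->next->accept(s->next, c);
        return;
    }
    char tmp[64];
    for (size_t off = 0; off < c->len; off += sizeof tmp) {
        size_t n = c->len - off < sizeof tmp ? c->len - off : sizeof tmp;
        for (size_t i = 0; i < n; i++)
            tmp[i] = (char)toupper((unsigned char)c->data[off + i]);
        output_chunk piece = { CHUNK_MEMORY, tmp, n };
        s->next->accept(s->next, &piece);
    }
}
```

The transformer works through a fixed 64-byte scratch buffer, so its memory use does not grow with the size of the response, which seems to be the property the thread cares about most.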
+ + - ben + +************************************************************** +From: Ben Hyde +Date: Wed, 23 Sep 1998 21:58:54 -0400 (EDT) +To: new-httpd@apache.org +Subject: Re: Core server caching +In-Reply-To: +References: <19980923210119.25763.qmail@zap.ml.org> + +Message-ID: <13833.39467.942203.885143@zap.ml.org> + +Alexei Kosut writes: +>On 23 Sep 1998, Ben Hyde wrote: +> +>> The core problem of caching seems to me to get confused by the +>> complexity of designing a caching proxy. If one ignores that then the +>> core problem of caching seems quite simple. +> +>Actually, for an HTTP server, they're the same problem, if you want to be +>able to cache any sort of dynamic request. And caching static requests is +>kind of silly (Dean's flow stuff notwithstanding, making copies of static +>files in either memory or on disk is silly, since the OS can do it better +>than we can). + +I don't disagree with any of the things you said, so I guess I'm +failing to get across where in this structure the functions your +pointing out as necessary would reside as versus where the "chunk +cache" mechanism I'm yearning for would fit. + +Well, that's not entirely true I do feel it's helpful to make this +point. + +The HTTP spec's definition of proper caching is terribly constrained +by the poverty of information available to the proxy server. He is +trapped in the middle between an opinionated content provider and an +opinionated content consumer. It was written in an attempt to keep +people like AOL from making their opinions dominate either of those +other two. Proper caching by a server that is right next to the +content generation can and ought to include both more or less +heuristics that are tunable by the opinions of the content provider +who presumably we are right next to. 
+ +Imagine the server that has a loop that goes like so: + + loop + r<-swallow_incomming_request + h<-select_response_handler(r) + initialize_response_pipeline() + push_pipeline_element(h) + tend_pipeline_until_done() + end loop + +In most of the web based applications I've seen the +select_response_handler step evolves into something that looks like an +AI expert system. That said, what I'd like to see is in Apache2 is a +simple dispatch along with a way to plug-in more complex dispatching +mechanisms. I'd very much like to avoid having that get confused with +the suite of response_handlers. + +I ignored the complexity of when to you can safely select +a cached value because I think it's in the select_response_handler +step. And possibly, I'll admit, not part of what I called the +"core server" + +Clearly I'm a fool for using this term 'core server' since it +doesn't mean anything. I wanted it to mean that loop above +and the most minimal implementations for the pipeline and +the select_response_handler one could imagine before starting +to pile on. The server as shipped would have a lot more +stuff in it! + +What I'm focused on is what has to be in that core versus +what has to be, but can be outside of it. + +So. as i thought about the state of the pipeline just after +the call on initialize_response_pipeline I at first thought +it would have something much like the current buffer abstraction +in the pipeline. Then i got to wondering if transfer encoding, +charset conversion, or caching ought to be in there. + +I think there is an argument for putting some caching functionality +in there. Possibly because that entire knot is what you'd move +into the OS if you could. Possibly because this is the bit +that must fly. + +Recall that I think the pipeline takes a stream of response +chunks with things like memory_chunk, transfer_file_chunk, etc. +in that stream. The question is what flavors of chunks does +that bottom element in the pipeline take. 
It's the chunks +that fly (and nothing more?). So I got to thinking about +what does it mean to have a cached_chunk. + +A cached_chunk needs only the small operation set along +the lines of what I mentioned. A full caching scheme +can build on it. As an added benefit the caching scheme +can be dumb, standard, extremely witty without effecting +this portion of the design. + +A quick point about why I wanted the cache to handle things +smaller than entire responses. This isn't central I guess. + +I want a protocol with content generators that encourages +them to use dynamic programming tricks to quickly generate +portions of pages that are static over long periods. Such +a scheme has worked well in systems we've built. + + - ben hyde + +************************************************************** +From: Ben Hyde +Date: Thu, 29 Oct 1998 23:16:37 -0500 (EST) +To: new-httpd@apache.org +Subject: Re: Core server caching +In-Reply-To: +References: + +Message-ID: <13881.12903.661334.819447@zap.ml.org> + +Dean Gaudet writes: +>On Thu, 29 Oct 1998, Rasmus Lerdorf wrote: +> +>> There are also weird and wacky things you would be able to do if you could +>> stack mod_php on top of mod_perl. +> +>You people scare me. +> +>Isn't that redundant though? +> +>Dean + +Yes it's scary, but oddly erotic, when these behemoths with their +gigantic interpreters try to mate. + +It's interesting syndrome, systems as soon as they get an interpreter +they tend to loose their bearings and grow into vast behemoths that +lumber about slowly crushing little problems with their vast mass. +Turing syndrome? + +I've heard people say modules can help avoid this, but I've rarely +seen it. Olde Unix kinda manages it remember being frightened by +awk. + +Can we nudge alloc.c/buff.c toward a bit of connective glue that +continues to let individual modules evolve their own gigantism while +avoiding vile effects on the core performance of the server? 
Stuff +like this: + + memory chunk alignment for optimal I/O + memory hand off along the pipeline + memory hand off crossing pool boundaries + memory hand off in zero copy cases + transmit file + transmit cache elements + insert/remove cache elements + leverage unique hardware and instructions + +That memcpy in ap_bread really bugs me. + +I'd rather have routines that let me hand off chunks. Presumably +these would need to be able to move chunks across pool and buffer +boundaries. But zero copy if I don't touch the content and never a +memcpy just to let me lex the input. + +I've built systems like this with the buffers exposing an emacs +buffer style of abstraction, but with special kinds of marks +to denote what's released for sending, and what's been accepted +and lex'd on the input side. It does mean all your +lexical and printf stuff has to be able to smoothly slide +over chunk boundaries. + + - ben + +************************************************************************* +Date: Sun, 27 Dec 1998 13:08:22 -0800 (PST) +From: Ed Korthof +To: new-httpd@apache.org +Subject: I/O filters & reference counts +Message-ID: + +Hi -- + +A while back, I indicated I'd propose a way to do reference counts w/ the +layered I/O I want to implement for 2.0 (assuming we don't use nspr)... +for single-threaded Apache, this seems unnecessary (assuming you don't use +shared memory in your filters to share data among the processes), but in +other situations it does have advantages. + +Anyway, what I'd propose involves using a special syntax when you want to +use reference counts. This allows Apache to continue using the +'pool'-based memory system (it may not be perfect, but imo it's reasonably +good), without creating difficulty when you wish to free memory. 
+ +If you're creating memory which you'll want to share among multiple +threads, you'll create it using a function more or less like: + + ap_palloc_share(pool *p, size_t size); + +you get back a void * pointer for use as normal. When you want to give +someone else a reference to it, you do the following: + + ap_pshare_data(pool *p1, pool *p2, void * data); + +where data is the return from above (and it must be the same). Then both +pools have a reference to the data & to a counter; when each pool is +cleaned up, it will automatically decrement the counter, and free the data +if the counter is down to zero. + +In addition, a pool can decrement the counter with the following: + + ap_pshare_free(pool * p1, void * data); + +after which the data may be freed. There would also be a function, + + ap_pshare_countrefs(pool * p1, void * data); + +which would return the number of pools holding a ref to 'data', or 1 if +it's not a shared block. + +Internally, the pool might either keep a list of the shared blocks, or a +balanced b-tree; if those are too slow, I'd look into passing back and +forth a (pointer to an) int, and simply use an array. The filter +declaring the shared memory would need to keep track of such an int, but +no one else would. + +In the context of I/O filters, this would mean that each read function +returns a const char *, which should not be cast to a non-const char * (at +least, not without calling ap_pshare_countrefs()). If a filter screwed +this up, you'd have a problem -- but that's more or less unavoidable with +sharing data among threads using reference counts. + +It might make sense to build a more general reference counting system; if +that's what people want, I'm also up for working on that. But one of the +advantages the pool system has is its simplicity, some of which would be +lost. + +Anyway, how does this sound? Reasonable or absurd? 
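The counting part of the proposal above can be sketched as follows. This is a simplified illustration and not the proposed Apache API: the names `shared_alloc`, `shared_ref`, `shared_unref` and `shared_countrefs` are hypothetical, the counter lives in a header in front of the data instead of in a per-pool list or tree, and the pool integration (dropping a reference automatically when a pool is cleaned up) is assumed to be layered on top and is not shown. The counter is not atomic, so sharing among threads would additionally need locking or atomic operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored immediately in front of every shared block. The extra
 * union members only pad the header so the payload stays aligned for
 * common types. */
typedef union shared_hdr {
    size_t refs;
    long double pad_ld;
    void *pad_ptr;
} shared_hdr;

static shared_hdr *hdr_of(void *data)
{
    return (shared_hdr *)data - 1;
}

/* Allocate a shared block; the caller holds the first reference. */
static void *shared_alloc(size_t size)
{
    shared_hdr *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->refs = 1;
    return h + 1;
}

/* Give another holder a reference to the same block. */
static void *shared_ref(void *data)
{
    assert(data != NULL);
    hdr_of(data)->refs++;
    return data;
}

/* Number of holders; a count of 1 means the caller may safely modify
 * or recycle the block. */
static size_t shared_countrefs(void *data)
{
    assert(data != NULL);
    return hdr_of(data)->refs;
}

/* Drop one reference. Returns 1 if this was the last reference and
 * the block was freed, 0 otherwise. */
static int shared_unref(void *data)
{
    shared_hdr *h = hdr_of(data);
    assert(h->refs > 0);
    if (--h->refs == 0) {
        free(h);
        return 1;
    }
    return 0;
}
```

In a pool-based version, a cleanup registered in each holding pool would presumably call something like `shared_unref` when that pool is destroyed.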
+ +Thanks -- + +Ed + ---------------------------------------- +History repeats itself, first as tragedy, second as farce. - Karl Marx + +************************************************************************* +From: Ben Hyde +Date: Tue, 29 Dec 1998 11:50:01 -0500 (EST) +To: new-httpd@apache.org +Subject: Re: I/O filters & reference counts +In-Reply-To: +References: + +Message-ID: <13960.60942.186393.799490@zap.ml.org> + + +There are two problems that reference counts address that we have, +but I still don't like them. + +These two are: pipeline memory management, and response paste up. A +good pipeline ought not _require_ memory proportional to the size of +the response but only proportional to the diameter of the pipe. +Response paste up is interesting because the library of clip art is +longer lived than the response or connection pool. There is a lot to +be said for leveraging the configuration pool life cycle for this kind +of thing. + +The pipeline design, and the handling of the memory it uses become +very entangled after a while - I can't think about one without the +other. This is the right place to look at this problem. I.e. this +is a problem to be lead by buff.c rework, not alloc.c rework. + +Many pipeline operations require tight coupling to primitive +operations that happen to be efficient. Neat instructions, memory +mapping, etc. Extreme efficiency in this pipeline makes it desirable +that the chunks in the pipeline be large. I like the phrase "chunks +and pumps" to summarize that there are two elements to design to get +modularity right here. + +The pasteup problem - one yearns for a library of fragments (call it a +cache, clip art, or templates if you like) which then readers in that +library can assemble these into responses. Some librarians like to +discard stale bits and they need a scheme to know that the readers +have all finished. The library resides in a pool that lives longer +than a single response connection. 
If the librarian can be convinced +that the server restart cycles are useful we get to a fall back to +there. + +I can't smell yet where the paste up problem belong in the 2.0 design +problem. (a) in the core, (b) in a module, (c) as a subpart of the +pipeline design, or (d) ostracized outside 2.0 to await a gift (XML?) +we then fold into Apache. I could probably argue any one of these. A +good coupling between this mechanism and the pipeline is good, limits +on the pipeline design space are very good. + + - ben + + +************************************************************************* +Date: Mon, 4 Jan 1999 18:26:36 -0800 (PST) +From: Ed Korthof +To: new-httpd@apache.org +Subject: Re: I/O filters & reference counts +In-Reply-To: <13960.60942.186393.799490@zap.ml.org> +Message-ID: + +On Tue, 29 Dec 1998, Ben Hyde wrote: + +> There are two problems that reference counts address that we have, +> but I still don't like them. + +They certainly add some clutter. But they offer a solution to the +problems listed below... and specifically to an issue which you brought up +a while back: avoiding a memcpy in each read layer which has a read +function other than the default one. Sometimes a memcpy is required, +sometimes not; with "reference counts", you can go either way. + +> These two are: pipeline memory management, and response paste up. A +> good pipeline ought not _require_ memory proportional to the size of +> the response but only proportional to the diameter of the pipe. +> Response paste up is interesting because the library of clip art is +> longer lived than the response or connection pool. There is a lot to +> be said for leveraging the configuration pool life cycle for this kind +> of thing. + +I was indeed assuming that we would use pools which would last from one +restart (and a run through of the configuration functions) to the next. + +So far as limiting the memory requirements of the pipeline -- this is +primarily a function of the module programming. 
Because the pipeline will +generally live in a single thread (with the possible exception of the data +source, which could be another processes), the thread will only be +operating on a single filter at a time (unless you added custom code to +create a new thread to handle one part of the pipeline -- ugg). + +For writing, the idea would be to print one or more blocks of text with +each call; wait for the write function to return; and then recycle the +buffers used. + +Reading has no writev equivalent, so you only be able to do it one block +at a time, but this seems alright to me (reading data is actually a much +less complicated procedure in practice -- at least, with the applications +which I've seen). + +Recycling read buffers (so as to limit the size of the memory pipeline) +is the hardest part, when we add in this 'reference count' scheme -- but +it can be done, if the modules recieving the data are polite and indicate +when they're done with the buffer. Ie.: + + module 1 module 2 +1.) reads from module 2: + char * ap_bread(BUFF *, pool *, int); + +2.) returns a block of text w/ ref counts: + str= char* ap_pshare_alloc(size_t); + ... + return str; + keeps a ref to str. + +3.) handles the block of data + returned, and indicates it's + finished with: + void ap_pshare_free(char * block); + reads more data via + char * ap_bread(BUFF *, pool *, int); + +4.) tries to recycle the buffer used: + if (ap_pshare_count_refs(str)==1) + reuse str + else + str = ap_pshare_alloc(...) + ... + return str; + +5.) handles the block of data + returned... +... + +One disadvantage is that if module 1 doesn't release its hold on a memory +block it got from step 2 until step 5, then the memory block wouldn't be +reused -- you'd pay w/ a free & a malloc (or with a significant increase +in complexity -- I'd probably choose the free & malloc). 
And if the module +failed to release the memory (via ap_pshare_free), then the memory +requirements would be as large as the response (or request). + +I believe this is only relevant for clients PUTting large files onto their +servers; but w/ files which are potentially many gigabytes, it is +important that filters handling reading do this correctly. Of course, +that's currently the situation anyhow. + +> The pipeline design, and the handling of the memory it uses become +> very entangled after a while - I can't think about one without the +> other. This is the right place to look at this problem. I.e. this +> is a problem to be lead by buff.c rework, not alloc.c rework. + +Yeah, after thinking about it a little bit I realized that no (or very +little) alloc.c work would be needed to implement the system which I +described. Basically, you'd have an Apache API function which does malloc +on its own, and other functions (also in the API) which register a cleanup +function (for the malloc'ed memory) in appropriate pools. + +IMO, the 'pipeline' is likely to be the easiest place to work with this, +at least in terms of getting the most efficient & clean design which we +can. + +[snip good comments] +> I can't smell yet where the paste up problem belong in the 2.0 design +> problem. (a) in the core, (b) in a module, (c) as a subpart of the +> pipeline design, or (d) ostracized outside 2.0 to await a gift (XML?) +> we then fold into Apache. I could probably argue any one of these. A +> good coupling between this mechanism and the pipeline is good, limits +> on the pipeline design space are very good. + +An overdesigned pipeline system (or an overly large one) would definitely +not be helpful. If it would be useful, I'm happy to work on this (even if +y'all aren't sure if you'd want to use it); if not, I'm sure I can find +things to do with my time. + +Anyway, I went to CPAN and got a copy of sfio... the latest version I +found is from Oct, 1997. 
I'd guess that using it (assuming this is +possible) might give us slightly less efficency (simply because sfio +wasn't built specifically for Apache, and customizing it is a much more +involved processes), but possibly fewer bugs to work out & lots of +interesting features. + +thanks -- + +Ed, slowly reading through the sfio source code + Index: ossp-pkg/sio/BRAINSTORM/doc_wishes.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_wishes.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/doc_wishes.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/doc_wishes.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/doc_wishes.txt +++ - 2024-05-20 02:18:02.387347157 +0200 @@ -0,0 +1,269 @@ +Wishes -- use cases for layered IO +================================== + +[Feel free to add your own] + +Dirk's original list: +--------------------- + + This file is there so that I do not have to remind myself + about the reasons for Layered IO, apart from the obvious one. + + 0. To get away from a 1 to 1 mapping + + i.e. a single URI can cause multiple backend requests, + in arbitrary configurations, such as in paralel, tunnel/piped, + or in some sort of funnel mode. Such multiple backend + requests, with fully layered IO can be treated exactly + like any URI request; and recursion is born :-) + + 1. To do on the fly charset conversion + + Be, theoretically, be able to send out your content using + latin1, latin2 or any other charset; generated from static + _and_ dynamic content in other charsets (typically unicode + encoded as UTF7 or UTF8). Such conversion is prompted by + things like the user-agent string, a cookie, or other hints + about the capabilities of the OS, language preferences and + other (in)capabilities of the final receipient. + + 2. 
To be able to do fancy templates + + Have your application/cgi sending out an XML structure of + field/value pair-ed contents; which is substituted into a + template by the web server; possibly based on information + accessible/known to the webserver which you do not want to + be known to the backend script. Ideally that template would + be just as easy to generate by a backend as well (see 0). + + 3. On the fly translation + + And other general text and output mungling, such as translating + an english page in spanish whilst it goes through your Proxy, + or JPEG-ing a GIF generated by mod_perl+gd. + + Dw. + + +Dean's canonical list of use cases +---------------------------------- + +Date: Mon, 27 Mar 2000 17:37:25 -0800 (PST) +From: Dean Gaudet +To: new-httpd@apache.org +Subject: canonical list of i/o layering use cases +Message-ID: + +i really hope this helps this discussion move forward. + +the following is the list of all applications i know of which have been +proposed to benefit from i/o layering. + +- data sink abstractions: + - memory destination (for ipc; for caching; or even for abstracting + things such as strings, which can be treated as an i/o + object) + - pipe/socket destination + - portability variations on the above + +- data source abstraction, such as: + - file source (includes proxy caching) + - memory source (includes most dynamic content generation) + - network source (TCP-to-TCP proxying) + - database source (which is probably, under the covers, something like + a memory source mapped from the db process on the same box, + or from a network source on another box) + - portability variations in the above sources + +- filters: + - encryption + - translation (ebcdic, unicode) + - compression + - chunking + - MUX + - mod_include et al + +and here are some of my thoughts on trying to further quantify filters: + +a filter separates two layers and is both a sink and a source. a +filter takes an input stream of bytes OOOO... 
and generates an +output stream of bytes which can be broken into blocks such +as: + + OOO NNN O NNNNN ... + + where O = an old or original byte copied from the input + and N = a new byte generated by the filter + +for each filter we can calculate a quantity i'll call the copied-content +ratio, or CCR: + + nbytes_old / nbytes_new + +where: + nbytes_old = number of bytes in the output of the + filter which are copied from the input + (in zero-copy this would mean "copy by + reference counting an input buffer") + nbytes_new = number of bytes which are generated + by the filter which weren't present in the + input + +examples: + +CCR = infinity: who cares -- straight through with no + transformation. the filter shouldn't even be there. + +CCR = 0: encryption, translation (ebcdic, unicode), compression. + these get zero benefit from zero-copy. + +CCR > 0: chunking, MUX, mod_include + +from the point of view of evaluating the benefit of zero-copy we only +care about filters with CCR > 0 -- because CCR = 0 cases degenerate into +a single-copy scheme anyhow. + +it is worth noting that the large_write heuristic in BUFF fairly +clearly handles zero-copy at very little overhead for CCRs larger than +DEFAULT_BUFSIZE. + +what needs further quantification is what the CCR of mod_include would +be. + +for a particular zero-copy implementation we can find some threshold k +where filters with CCRs >= k are faster with the zero-copy implementation +and CCRs < k are slower... faster/slower as compared to a baseline +implementation such as the existing BUFF. + +it's my opinion that when you consider the data sources listed above, and +the filters listed above that *in general* the existing BUFF heuristics +are faster than a complete zero-copy implementation. + +you might ask how does this jive with published research such as the +IO-Lite stuff? 
well, when it comes right down to it, the research in +the IO-Lite papers deal with very large CCRs and contrast them against +a naive buffering implementation such as stdio -- they don't consider +what a few heuristics such as apache's BUFF can do. + +Dean + + +Jim's summary of a discussion +----------------------------- + + OK, so the main points we wish to address are (in no particular order): + + 1. zero-copy + 2. prevent modules/filters from having to glob the entire + data stream in order to start processing/filtering + 3. the ability to layer and "multiplex" data and meta-data + in the stream + 4. the ability to perform all HTTP processing at the + filter level (including proxy), even if not implemented in + this phase + 5. Room for optimization and recursion + + Jim Jagielski + + +Roy's ramblings +--------------- + + Data flow networks are a very well-defined and understood software + architecture. They have a single, very important constraint: no filter + is allowed to know anything about the nature of its upstream or downstream + neighbors beyond what is defined by the filter's own interface. + That constraint is what makes data flow networks highly configurable and + reusable. Those are properties that we want from our filters. + + ... + + One of the goals of the filter concept was to fix the bird's nest of + interconnected side-effect conditions that allow buff to perform well + without losing the performance. That's why there is so much trepidation + about anyone messin with 1.3.x buff. + + ... + + Content filtering is my least important goal. Completely replacing HTTP + parsing with a filter is my primary goal, followed by a better proxy, + then internal memory caches, and finally zero-copy sendfile (in order of + importance, but in reverse order of likely implementation). 
Content + filtering is something we get for free using the bucket brigade interface, + but we don't get anything for free if we start with an interface that only + supports content filtering. + + ... + + I don't think it is safe to implement filters in Apache without either + a smart allocation system or a strict limiting mechanism that prevents + filters from buffering more than 8KB [or user-definable amount] of memory + at a time (for the entire non-flushed stream). It isn't possible to + create a robust server implementation using filters that allocate memory + from a pool (or the heap, or a stack, or whatever) without somehow + reclaiming and reusing the memory that gets written out to the network. + There is a certain level of "optimization" that must be present before + any filtering mechanism can be in Apache, and that means meeting the + requirement that the server not keel over and die the first time a user + requests a large filtered file. XML tree manipulation is an example + where that can happen. + + ... + + Disabling content-length just because there are filters in the stream + is a blatant cop-out. If you have to do that then the design is wrong. + At the very least the HTTP filter/buff should be capable of discovering + whether it knows the content length by examing whether it has the whole + response in buffer (or fd) before it sends out the headers. + + ... + + No layered-IO solution will work with the existing memory allocation + mechanisms of Apache. The reason is simply that some filters can + incrementally process data and some filters cannot, and they often + won't know the answer until they have processed the data they are given. + This means the buffering mechanism needs some form of overflow mechanism + that diverts parts of the stream into a slower-but-larger buffer (file), + and the only clean way to do that is to have the memory allocator for the + stream also do paging to disk. 
You can't do this within the request pool + because each layer may need to allocate more total memory than is available + on the machine, and you can't depend on some parts of the response being + written before later parts are generated because some filtering + decisions require knowledge of the end of the stream before they + can process the beginning. + + ... + + The purpose of the filtering mechanism is to provide a useful + and easy to understand means for extending the functionality of + independent modules (filters) by rearranging them in stacks + via a uniform interface. + + +Paul J. Reder's use cases for filters +------------------------------------- + + 1) Containing only text. + 2) Containing 10 .gif or .jpg references (perhaps filtering + from one format to the other). + 3) Containing an exec of a cgi that generates a text only file + 4) Containing an exec of a cgi that generates an SSI of a text only file. + 5) Containing an exec of a cgi that generates an SSI that execs a cgi + that generates a text only file (that swallows a fly, I don't know why). + 6) Containing an SSI that execs a cgi that generates an SSI that + includes a text only file. + NOTE: Solutions must be able to handle *both* 5 and 6. Order + shouldn't matter. + 7) Containing text that must be altered via a regular expression + filter to change all occurrences of "rederpj" to "misguided" + 8) Containing text that must be altered via a regular expression + filter to change all occurrences of "rederpj" to "lost" + 9) Containing perl or php that must be handed off for processing. + 10) A page in ascii that needs to be converted to ebcdic, or from + one code page to another. + 11) Use the babelfish translation filter to translate text on a + page from Spanish to Martian-Swahili. 
+ 12) Translate to Esperanto, compress, and encrypt the output from + a php program generated by a perl script called from a cgi exec + embedded in a file included by an SSI :) + Index: ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.L +++ - 2024-05-20 02:18:02.389954070 +0200 @@ -0,0 +1 @@ +http://www.cs.ucsd.edu/groups/csl/pubs/conf/sosp95.html Index: ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/ewa.thesis.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/infocom98.ps.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/infocom98.ps.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/infocom98.ps.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/infocom98.ps.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/infocom98.ps.L +++ - 2024-05-20 02:18:02.395313822 +0200 @@ -0,0 +1 @@ +http://www.cs.cmu.edu/~jcb/ Index: ossp-pkg/sio/BRAINSTORM/infocom98.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/infocom98.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/infocom98.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/infocom98.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/infocom98.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/io-events.html RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/io-events.html,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/io-events.html,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/io-events.html' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/io-events.html +++ - 2024-05-20 02:18:02.400448441 +0200 @@ 
-0,0 +1,354 @@ +I/O Event Handling Under Linux + +
+

I/O Event Handling Under Linux

+

Richard Gooch

+

28-JUN-1998

+
+
+ +

Introduction

+I/O Event handling is about how your Operating System allows you to +manage a large number of open files (file descriptors in UNIX/POSIX, +or FDs) in your application. You want the OS to notify you when FDs +become active (have data ready to be read or are ready for +writing). Ideally you want a mechanism that is scalable. This means a +large number of inactive FDs cost very little in memory and CPU time +to manage. +

+First I should start off by stating my goal: to develop a thin +interface that makes best use of the facilities the OS provides, +scaling as much as possible. Where possible, use of standard POSIX +facilities is preferred. A secondary goal is to provide a +convenient interface for applications. The reason for preferring +standard POSIX interfaces is that it introduces fewer conditional +blocks in the interface code. + +

+

The Traditional UNIX Way

+Traditional Unix systems provide the select(2) and/or +poll(2) system calls. With both of these you pass an array of +FDs to the kernel, with an optional timeout. When there is activity, +or when the call times out, the system call will return. The +application must then scan the array to see which FDs are active. This +scheme works well with small numbers of FDs, and is simple to +use. Unfortunately, for thousands of FDs, this does not work so well. + +
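The polling pattern described above can be sketched roughly as follows. This is a minimal illustration, not code from the article: the helper name `poll_and_read` and the 4096-byte buffer are assumptions made for the example. It shows the two scans involved: the kernel scans the array inside `poll(2)`, then the application scans it again to find the active FDs.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * One pass of a traditional polling loop (hypothetical helper).
 * Returns the total number of bytes read from ready FDs,
 * 0 if the call timed out, or -1 if poll() failed.
 */
long poll_and_read(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    assert(fds != NULL || nfds == 0);

    /* The kernel scans all nfds entries during this call. */
    int ready = poll(fds, nfds, timeout_ms);
    if (ready <= 0)
        return ready;

    long total = 0;
    char buf[4096];

    /* The application scans the array again to find the active FDs. */
    for (nfds_t i = 0; i < nfds && ready > 0; i++) {
        if (fds[i].revents == 0)
            continue;           /* inactive entry: scanned for nothing */
        ready--;
        if (fds[i].revents & POLLIN) {
            ssize_t n = read(fds[i].fd, buf, sizeof buf);
            if (n > 0)
                total += n;
        }
    }
    return total;
}
```

An application would call such a helper in a loop, passing the same full array every time, which is where the cost discussed next comes from.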

+The kernel has to scan your array of FDs and check which ones are +active. This takes approximately 3 microseconds (3 us) per FD on a +Pentium 100 running Linux 2.1.x. Now you might think that 3 us is +quite fast, but consider if you have an array of 1000 FDs. This is now +3 milliseconds (3 ms), which is 30% of your timeslice (each timeslice +is 10 ms). If it happens that there is initially no activity and you +specified a timeout, the kernel will have to perform a second +scan after some activity occurs or the syscall times out. Ouch! If you +have an even bigger application (like a large http server), you can +easily have 10000 FDs. Scanning times will then take 30 ms, which is +three timeslices! This is just way too much. + +

+You might say that 3 ms for 1000 FDs is not such a big deal: a user +will hardly notice that. The problem is that the entire array of FDs +is scanned each time you want to go back to your polling +loop. The way these applications work is that after checking for +activity on FDs, the application processes the activity (for example, +reading data from active FDs). When all the activity has been +processed, the application goes back to polling the OS for more FD +activity. In many cases, only a small number of FDs are active at any +one time (say during each timeslice), so it may only take a few +milliseconds to process all the activity. High performance http +servers can process hundreds or thousands of transactions per +second. A server that takes 2 ms to process each active FD can process +500 transactions per second. If you add 3 ms for FD scanning in the +kernel, you now have 5 ms per transaction. That only gives 200 +transactions per second, a massive drop in performance. + +
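The arithmetic in the last two paragraphs can be checked with a few lines of integer code. This is only a sketch: the function names are made up for the example, and the inputs (3 us per FD, a 10 ms timeslice, 2 ms of processing per transaction) are the figures quoted in the text.

```c
#include <assert.h>

/* Time the kernel spends scanning nfds descriptors, in microseconds. */
unsigned long scan_cost_us(unsigned long nfds, unsigned long per_fd_us)
{
    return nfds * per_fd_us;
}

/* Share of one timeslice consumed by a single scan, in percent. */
unsigned long scan_share_percent(unsigned long nfds, unsigned long per_fd_us,
                                 unsigned long timeslice_us)
{
    assert(timeslice_us > 0);
    return scan_cost_us(nfds, per_fd_us) * 100 / timeslice_us;
}

/*
 * Transactions per second when every transaction pays its own
 * processing time plus one full scan of the FD array.
 */
unsigned long transactions_per_sec(unsigned long process_us,
                                   unsigned long nfds,
                                   unsigned long per_fd_us)
{
    unsigned long per_txn_us = process_us + scan_cost_us(nfds, per_fd_us);
    assert(per_txn_us > 0);
    return 1000000UL / per_txn_us;
}
```

With the article's numbers, `scan_share_percent(1000, 3, 10000)` gives 30, and `transactions_per_sec(2000, 1000, 3)` gives 200 compared with 500 when no scan is charged.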

+There is another problem, and that is that the application needs to +scan the "returned" FD array that the kernel has updated to see which +FDs are active. This is yet another scan of a large array. This isn't +as costly as the kernel scan, for reasons I'll get to later, but it is +still a finite cost. + +

+

New POSIX Interfaces

+A fairly simple proposal is to use the POSIX.4 Asynchronous I/O (AIO) +interface (aio_read() and friends). Here we would call +aio_read() for each FD. This would then queue thousands of +asynchronous I/O requests. This model looks appealing, until we look +under the hood of some aio_*() implementations. The Linux glibc +implementation is a case in point: there is no kernel +support. Instead, the C library (glibc 2.1) launches a thread per FD +for which there are outstanding AIO requests (up to the maximum number +of configured threads). +In general, implementing this facility in the C library is reasonable, +as it avoids kernel bloat. However, if you use this facility to start +thousands of AIO requests, you may end up creating thousands of +threads. This is no good, since threads are costly. The "obvious" +solution is to implement AIO in the Linux kernel, then. Another +solution is to use userspace tricks to avoid the scalability problems +(see the description of migrating FDs below). These solutions may be +fine if you only want to run under Linux, but are not much help if you +want to run under another OS which also implements AIO using threads +(and for which you don't have the source code, so you cannot change the +implementation). The point here is that there appears to be no +guarantee that aio_*() implementations are scalable across +platforms which support it. +

+It is also worth noting that POSIX.4 Asynchronous I/O is not +necessarily available on all POSIX.4 compliant systems (facilities +defined by POSIX.4 are optional). So even if you were prepared to +limit your application to POSIX.4 systems, there is still no guarantee +that AIO is available. Many or most implementations will be +scalable, but we can't be sure all are scalable, so we need an +alternative. +

+I should also point out that I don't think that the POSIX.4 AIO +interface is particularly appropriate for a network server +application. AIO doesn't provide a callback interface, so it is harder +to build higher-layer network connection objects in your +application. So just writing a better (scalable) implementation of +AIO is not the ideal, either. + +

+

Readiness Event Queues and other OSes

+Some other operating systems provide a mechanism called I/O completion +ports and some have event queues. These are a mechanism to tell the OS +that you want to be notified when there is activity on a FD. Usually, +when a FD becomes active (i.e. becomes ready for reading or writing), +the OS will send a message (a "readiness event") to a "port" (perhaps +another FD such as a pipe). This message contains the FD that has +become active. The one port can have many readiness events from many +FDs sent to it. The key difference here is that you do not need to +pass the OS a massive FD array each time you want to listen for +events: you only need to tell the OS once for each FD that you want to +receive readiness events for that FD. The kernel no longer +needs to scan a massive FD array each time through your polling loop, +and nor does the application. This is an appealing approach, and +scales very well. + +

+Unfortunately, I/O readiness queues are not POSIX. They're not even +Unix. We could add them to Linux, but it would mean that applications +that relied on this mechanism would be unportable, Linux-only. Also, +it could involve significant additions to the kernel, which sets off +my bloat alarms. I would hope that it is possible to develop an +effective alternative that uses standard POSIX/UNIX functionality. + +

+

Optimising Existing UNIX Interfaces

+There are improvements we can make for the massive FD scanning +problem. Firstly we can optimise the way the scanning is done inside +the kernel. Right now (2.1.106) the kernel has to call the +poll() method for each file structure. This is expensive. Back +in the 2.1.5x kernels, I coded a + +better implementation +for the kernel which sped things up almost 3 times. While this +requires modifications to drivers to take advantage of this, it has +the advantage of not changing the semantics we expect from UNIX. Note +one other interesting feature of this optimisation: it centralises +event notification, which in turn would make implementing I/O +readiness queues simpler. I'm not sure how closure of FDs before +readiness events are read should be handled. This could complicate +their implementation. +

+Doing this optimisation does not solve our problem, though. It only +pushes the problem away for a while. + +

+

Making Better Use of Existing UNIX Interfaces

+Note that for my purposes, it is better to optimise the application so +that it works well on many OSes rather than optimising a single +OS. Creating new interfaces for Linux is a last resort. Also note that +this section assumes that an OS of interest does not have an existing +(preferably POSIX) mechanism that supports FD management in a scalable +way. +

+Another solution (which would also benefit from the kernel +optimisation discussed above) is for the application to divide the FD +array into a number of smaller FD arrays, say 10. You then create 10 +threads, each of which has a polling loop using its smaller FD +array. So each FD array is now 100 entries long. While this doesn't +change the total number of FDs that must be scanned, it does +change when they have to be scanned. Since most FDs are +inactive, not all the threads will be woken up. To see how this +works, consider the example where, at any time (say during a single +timeslice of 10 ms), only 5 FDs are active. Assuming these FDs are +randomly, uniformly distributed, at most 5 threads will need to be +woken up. These threads then process the activity and go back to the +start of their polling loops. Where we win is that only 5 threads had +to go back and call select(2) or poll(2). Since they +each have 100 entry FD arrays, the kernel only has to scan 500 +FDs. This has halved the amount of scanning required. The scanning +load has gone from 30% to 15% by this simple change. If you were to +instead use 100 threads, you would still only have at most 5 threads +woken up for activity, and hence the total number of FDs scanned this +timeslice would be 50. This takes 150 us at 3 us per FD, which brings +the scanning load down to 1.5% of a 10 ms timeslice, which is small. + +
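The partitioning argument can be made concrete with a small sketch. The names `fd_slice`, `slice_for_thread` and `worst_case_rescanned` are hypothetical; the code only computes which part of the FD array each thread would own and an upper bound on how many FDs the kernel rescans when a given number of FDs become active.

```c
#include <assert.h>
#include <stddef.h>

/* Part of the global FD array owned by one polling thread. */
struct fd_slice {
    size_t start;   /* index of the first entry */
    size_t len;     /* number of entries */
};

/* Split `total` FDs as evenly as possible across `threads` threads. */
struct fd_slice slice_for_thread(size_t total, size_t threads, size_t k)
{
    assert(threads > 0 && k < threads);
    size_t base = total / threads;
    size_t extra = total % threads;     /* first `extra` slices get one more */
    struct fd_slice s;
    s.len = base + (k < extra ? 1 : 0);
    s.start = k * base + (k < extra ? k : extra);
    return s;
}

/*
 * Upper bound on the FDs rescanned when `active` FDs fire: at most
 * min(active, threads) threads wake up, and each rescans its own slice.
 */
size_t worst_case_rescanned(size_t total, size_t threads, size_t active)
{
    assert(threads > 0);
    size_t woken = active < threads ? active : threads;
    size_t largest = (total + threads - 1) / threads;   /* ceiling */
    size_t scanned = woken * largest;
    return scanned < total ? scanned : total;
}
```

For 1000 FDs and 5 active FDs this gives 500 rescanned entries with 10 threads and 50 with 100 threads, matching the example in the text.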

+There is one thing to watch out for here: if you use select(2) +in your polling loop, be aware that the size of your FD array is equal +to the value of your largest FD. This is because select(2) uses +a bitmask for its FD array. This means one of your threads will want +to poll FDs 991 to 1000. Unfortunately, your FD array is still 1000 +long. What's worse, the kernel still has to do a minimal scan for all +those 1000 FDs. The solution to this is to use poll(2) instead, +where you only have to pass as many FDs as you want to poll, and the +kernel scans only those. +
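The difference between the two calls shows up in the arguments a thread has to prepare. The sketch below uses hypothetical helper names and only builds the argument structures without calling into the kernel: for `select(2)` the `nfds` argument is the highest descriptor plus one, while for `poll(2)` it is simply the number of entries.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Build the read set for select(2). Returns the nfds argument, which is
 * the highest descriptor number plus one, however few FDs are in the set.
 */
int build_select_set(const int *fds, size_t n, fd_set *set)
{
    int maxfd = -1;
    FD_ZERO(set);
    for (size_t i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;
}

/*
 * Build the array for poll(2). Returns the nfds argument, which is just
 * the number of descriptors this thread cares about.
 */
nfds_t build_poll_set(const int *fds, size_t n, struct pollfd *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i].fd = fds[i];
        out[i].events = POLLIN;
        out[i].revents = 0;
    }
    return (nfds_t)n;
}
```

For a thread that owns descriptors 990 to 999, the first helper returns 1000 and the second returns 10.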

+This solution sounds ideal: just create lots and lots of threads. At +the extreme, you create one thread per FD. There is a problem here, +however, as each thread consumes system resources. So you need to +compromise between the number of threads and the FD scanning load. +Also, the more threads you have the more cache misses you induce, so +this is something to avoid as well. Fortunately in this case most +threads will be running nearly the same code at the same time, so +cache pollution should not be a significant problem. +

+A more advanced solution is to have dynamic migration of FDs depending +on whether they are mostly active or inactive. In the simplest case, +you only have two threads: one polls mostly active FDs and the +other polls mostly inactive FDs. The thread for active FDs will be +woken up very frequently, but on the other hand will have only a small +number of FDs to scan. The other thread will have to scan a large +number of FDs, but it will only be woken up occasionally. For each FD +an activity counter is kept. When a FD on the mostly inactive list is +deemed to be fairly active, it is migrated to the mostly active +list. A reverse operation occurs for fairly inactive FDs on the mostly +active list. +
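One way the per-FD activity counter might work is sketched below. Everything here is an assumption made for illustration: the text does not say how the counter is updated or where the thresholds lie, so the decaying counter, the weight of 4 and the promote/demote thresholds are invented values.

```c
#include <assert.h>
#include <stddef.h>

enum fd_list { LIST_MOSTLY_INACTIVE, LIST_MOSTLY_ACTIVE };

struct fd_state {
    int fd;
    unsigned activity;      /* decaying activity counter */
    enum fd_list list;      /* which polling thread owns this FD */
};

/* Illustrative tuning values, not taken from the article. */
#define ACTIVE_WEIGHT 4u
#define PROMOTE_AT    6u
#define DEMOTE_AT     1u

/*
 * Account for one polling pass. Old history is halved and fresh activity
 * adds ACTIVE_WEIGHT, so the counter tracks recent behaviour. The gap
 * between the two thresholds keeps an FD from bouncing between lists.
 * Returns 1 if the FD migrated to the other list, 0 otherwise.
 */
int fd_account(struct fd_state *s, int was_active)
{
    assert(s != NULL);
    s->activity = s->activity / 2 + (was_active ? ACTIVE_WEIGHT : 0u);

    if (s->list == LIST_MOSTLY_INACTIVE && s->activity >= PROMOTE_AT) {
        s->list = LIST_MOSTLY_ACTIVE;
        return 1;
    }
    if (s->list == LIST_MOSTLY_ACTIVE && s->activity <= DEMOTE_AT) {
        s->list = LIST_MOSTLY_INACTIVE;
        return 1;
    }
    return 0;
}
```

With these values an FD is promoted after two consecutive active passes and demoted again after two consecutive idle ones; a real implementation would also have to move the entry between the two threads' FD arrays.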

+I favour this solution, since it can be implemented solely in +userspace and is portable to other POSIX systems. I have an existing +software library which +has (amongst other things) support for + +managing events on FDs. I plan on extending this library to use +the above technique. The library is distributed under the LGPL. Watch +this space for results. +

+My approach is to squeeze as much performance as we can out of the +existing POSIX/UNIX interface by optimising the kernel and doing +clever things in userspace. We can then evaluate how much of a +bottleneck polling is under Linux. If polling overheads (for very +large numbers of FDs) are kept to within a few percent, I firmly +believe there is no need for I/O readiness queues. If polling overheads +remain above 10%, then we may consider I/O readiness queues or other +extensions to Linux. However, it would be better to add kernel support +for scalable AIO rather than implement readiness queues, since AIO is +a POSIX standard whereas readiness queues are not. + +

+

Minor Additions to UNIX Interfaces

+OK, now I'll get back to another point I raised earlier, and that is +the time taken for the application to scan a list of FDs for +activity. We would like to reduce this time as well, and we can +without much effort at all. Instead of using poll(2) we can +implement poll2(2), a new syscall which is very much like +poll(2) but returns an array of active FDs. I +implemented this last year when I first started thinking about +optimising FD management under Linux. +
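The text does not give the actual poll2(2) signature, so the following is only a userspace stand-in that shows the intended result: a compact array holding just the active FDs. The names `collect_active` and `poll2_emulated` are hypothetical, and this emulation still performs the application-side scan that a real poll2(2) would avoid.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/*
 * Copy the descriptors that have pending events into `active`.
 * Returns how many were stored (at most max_active).
 */
size_t collect_active(const struct pollfd *fds, nfds_t nfds,
                      int *active, size_t max_active)
{
    size_t n = 0;
    for (nfds_t i = 0; i < nfds && n < max_active; i++) {
        if (fds[i].revents != 0)
            active[n++] = fds[i].fd;
    }
    return n;
}

/*
 * Userspace emulation of the idea behind poll2(2): wait for activity,
 * then hand back only the active FDs. Returns the number of entries
 * stored in `active`, 0 on timeout, or -1 if poll() failed.
 */
int poll2_emulated(struct pollfd *fds, nfds_t nfds, int timeout_ms,
                   int *active, size_t max_active)
{
    assert(active != NULL || max_active == 0);
    int ready = poll(fds, nfds, timeout_ms);
    if (ready <= 0)
        return ready;
    return (int)collect_active(fds, nfds, active, max_active);
}
```

A caller would then loop over the returned entries only, instead of over the whole `fds` array.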

+So now the application only needs to search a very small array. This +new syscall is based on the principle of not duplicating work. The +kernel has just gone and scanned all these FDs for us, why not keep a +record of that work, rather than doing it again in the application? +Note that the current Linux polling implementation is so slow that +implementing poll2(2) will make little difference. However, +once the optimisations I've proposed are added, it will make +a difference (yes, I've done the benchmarks). Oh, and before you take +my comments on the Linux polling implementation too far, note that +I've benchmarked it against several other OSes, and Linux came out far +better in any case. My point is that we can still do a lot better. + +

+You might ask, why implement poll2(2) if we have this clever +userspace solution that avoids many of the problems? Well, the point +is that it would speed up the application processing the inactive list +of FDs, and that is always a good thing. However, it may turn out that +the gain from poll2(2) is marginal. You could also argue that +since poll2(2) is non-POSIX, non-UNIX, then why bother with it +at all, and why not simply implement I/O readiness queues? Well, I +could say that poll2(2) is only a small departure from +UNIX whereas I/O readiness queues are more radical. However, this +isn't a very strong argument. I'm waiting to see the results of the +userspace solution before considering poll2(2) +further. Furthermore, it would be better to make the Linux +implementation of AIO scalable, since this is a standard POSIX +interface. So poll2(2) is probably dead. It does have one nice +feature, though: it would be easier for an application to change to +using poll2(2) than to change to using AIO. + +

+Another possibility is to make a slight extension to the existing UNIX +asynchronous signal delivery mechanism. Currently, you can request the +OS to deliver a signal when a FD becomes ready for reading or +writing. Unfortunately, if you do this for multiple FDs your signal +handler doesn't know which FD is ready for activity. This is +because UNIX signals do not carry extra information with them. You +can't use a separate signal number for each FD, since UNIX has only a +few signal numbers. However, POSIX real-time signals can carry a word +of data to the signal handler. So we could extend Linux such that if +you request a POSIX RT signal to be delivered when an FD is ready for +I/O, the word of data is the FD number itself. Unfortunately, +depending on this behaviour would once again be Linux-specific. +However, such a system does look like an attractive way of providing +the foundations for a scalable AIO implementation. + +
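The "word of data" carried by a POSIX real-time signal can be demonstrated entirely in userspace. In the sketch below the process queues the signal to itself with `sigqueue()`, which merely simulates the kernel behaviour proposed above; the helper names are hypothetical, and the signal is collected synchronously with `sigwaitinfo()` rather than in a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Block signo so it stays queued until we ask for it; 0 on success. */
int block_ready_signal(int signo)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/*
 * Stand-in for the proposed kernel extension: queue an RT signal to this
 * process with the FD number as its payload. Returns 0 on success.
 */
int notify_fd_ready(int signo, int fd)
{
    union sigval v;
    v.sival_int = fd;
    return sigqueue(getpid(), signo, v);
}

/* Wait for the next queued signo; returns the FD it carries, or -1. */
int wait_fd_ready(int signo)
{
    sigset_t set;
    siginfo_t info;
    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigwaitinfo(&set, &info) < 0)
        return -1;
    return info.si_value.sival_int;
}
```

The signal has to be blocked before anything is queued, since the default action of an unhandled real-time signal is to terminate the process; `SIGRTMIN` would be a typical choice for `signo`.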

+

Mixing and Matching

+A good implementation of POSIX.4 AIO should be superior to my +migrating FD scheme, since AIO should require no polling +whatsoever. Therefore my interface code should be able to make use of +AIO if available. However, since an AIO implementation may in fact not +scale well, its performance will have to be compared to the migrating +FD scheme to determine whether or not it should be utilised. +

+Similarly, for a system without POSIX.4 AIO but with readiness queues +it would make sense for the interface code to utilise this facility. + +

+

The Thundering Herd Problem

+A note on the "thundering herd" problem: it's not really of much +interest in this discussion. The "problem" arises because people +attempt to increase concurrency of accepting new connections by having +multiple threads all blocking waiting on select(2). Because the +kernel wakes up all threads, rather than one thread per new +connection, we have a "thundering herd" of freshly woken up +threads. These problems are best solved by treating the accepting of +incoming connections as just another case of I/O management, rather +than special-casing them. + +

+

Other Resources

+Readers may find a recent paper +presented at USENIX98 of interest. The author also has a list of + +other research on this topic. The paper essentially describes a +mechanism for improving the speed of select(2) by adding +internal state information to the kernel about FD activity. + +

+

Comments Please

+I invite comments or additions to this document. My purpose is to +explain the various issues involved and serve as a primer for more +debate. I hope to avoid recurring debates on the linux-kernel list +which go over the same ground and either never contribute anything new +to the arguments, or take many messages before something new is +added. If you have a strong technical argument that I've missed, I'll +be happy to add it to this document. + +
+Original: 22-JUN-1998 +
+Back to my Home Page +
+
+
Richard Gooch (rgooch@atnf.csiro.au)
+
+ + Index: ossp-pkg/sio/BRAINSTORM/io-events.html.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/io-events.html.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/io-events.html.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/io-events.html.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/io-events.html.L +++ - 2024-05-20 02:18:02.403180228 +0200 @@ -0,0 +1 @@ +http://www.atnf.csiro.au/~rgooch/linux/docs/io-events.html Index: ossp-pkg/sio/BRAINSTORM/io-events.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/io-events.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/io-events.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/io-events.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/io-events.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/iolib-eval-rse.txt RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/iolib-eval-rse.txt,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/iolib-eval-rse.txt,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/iolib-eval-rse.txt' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/iolib-eval-rse.txt +++ - 2024-05-20 02:18:02.408141763 +0200 @@ -0,0 +1,28 @@ + +stdio - 1990-1991 Steve Summit + reimplemented stdio, user-definable underlying I/O functions, improved + error handling, new routines for "string" I/O, and efficient unbuffered + I/O + +sfio - 1991-1998 David Korn and Kiem-Phong Vo, All rights reserved. + covers all features, weak layering, portable, fast + +sio - 1993 Panos Tsirigotis + no socket support, but mmap + +bstdio - 1993-1994 Chris Provenzano + threadsafe stdio based on the BSD stdio, uses pthreads + +buff - 1995-1999 Robert S Thau, Apache Group + Apache Buffer and I/O library + +substdio - Dan Bernstein + - minimalistic stdio + +bio - 199x-1999 Eric A. 
Young + BIO, sockets, memory, stacking + +rt-aio - 1997 Ulrich Drepper + pthread-based async I/O from glibc + + Index: ossp-pkg/sio/BRAINSTORM/osdi96.ps.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/osdi96.ps.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/osdi96.ps.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/osdi96.ps.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/osdi96.ps.L +++ - 2024-05-20 02:18:02.412207595 +0200 @@ -0,0 +1 @@ +http://www.cs.cmu.edu/~jcb/ Index: ossp-pkg/sio/BRAINSTORM/osdi96.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/osdi96.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/osdi96.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/osdi96.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/osdi96.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.L +++ - 2024-05-20 02:18:02.420196078 +0200 @@ -0,0 +1,2 @@ +http://www.smli.com/technical-reports/1995/abstract-39.html +http://www.smli.com/technical-reports/1995/1995.html Index: ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.pdf RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.pdf,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.pdf,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.pdf' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.pdf and - differ Index: ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/smli_tr-95-39.ps.gz and - differ Index: 
ossp-pkg/sio/BRAINSTORM/smli_tr-99-76.ps RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-99-76.ps,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/smli_tr-99-76.ps,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/smli_tr-99-76.ps' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/smli_tr-99-76.ps +++ - 2024-05-20 02:18:02.428332079 +0200 @@ -0,0 +1,8960 @@ +%!PS-Adobe-3.0 +%%Title: (Finished.pdf) +%%Version: 1 2 +%%Creator: (FrameMaker xm5.5.3L15a) +%%CreationDate: (D:19990427163055) +%%DocumentData: Clean7Bit +%%LanguageLevel: 2 +%%BoundingBox: 0 0 612 792 +%%Pages: 30 +%%DocumentProcessColors: Cyan Magenta Yellow Black +%%DocumentSuppliedResources: +%%+ font Times-Italic +%%+ font Times-Bold +%%+ font Courier +%%+ font Symbol +%%+ font Courier-Bold +%%+ font Helvetica +%%+ font Times-Roman +%%+ font Helvetica-Bold +%%+ font Times-Roman +%%+ procset (Adobe Acrobat - PDF operators) 1.2 0 +%%+ procset (Adobe Acrobat - type operators) 1.2 0 +%%EndComments +%%BeginDefaults +%%EndDefaults +%%BeginProlog +%%EndProlog +%%BeginSetup +%%BeginFile: l2check +%%Copyright: Copyright 1993 Adobe Systems Incorporated. All Rights Reserved. +/languagelevel where +{ pop languagelevel 1 eq } +{ true } +ifelse +{ +initgraphics /Helvetica findfont 18 scalefont setfont +72 600 moveto (Error: Your printer driver needs to be configured) dup show +72 580 moveto (for printing to a PostScript Level 1 printer.) dup show +exch = = +/Helvetica-Bold findfont 16 scalefont setfont +72 520 moveto (Windows and Unix) show +/Times-Roman findfont 16 scalefont setfont +72 500 moveto (Select ªLevel 1º in the PostScript options section) show +72 480 moveto (of the Acrobat Exchange or Reader print dialog.) show +/Helvetica-Bold findfont 16 scalefont setfont +72 440 moveto (Macintosh) show +/Times-Roman findfont 16 scalefont setfont +72 420 moveto (In the Chooser, select your printer driver.) show +72 400 moveto (Then select your printer and click the Setup button.) 
show +72 380 moveto (Follow any on-screen dialogs that may appear.) show +showpage +quit +} +if +%%EndFile +/currentpacking where{pop currentpacking true setpacking}if +userdict /PDF 85 dict put +%%BeginFile: pdfvars.prc +%%Copyright: Copyright 1987-1996 Adobe Systems Incorporated. All Rights Reserved. +userdict /PDFVars 75 dict dup begin put +/_save 0 def +/_cshow 0 def +/InitAll 0 def +/TermAll 0 def +/_lp /none def +/_doClip 0 def +/sfc 0 def +/_sfcs 0 def +/_sfc 0 def +/ssc 0 def +/_sscs 0 def +/_ssc 0 def +/_fcs 0 def +/_scs 0 def +/_fp 0 def +/_sp 0 def +/_f0 0 array def +/_f1 1 array def +/_f3 3 array def +/_f4 4 array def +/_fc null def +/_s0 0 array def +/_s1 1 array def +/_s3 3 array def +/_s4 4 array def +/_sc null def +/_cpcf null def +/_cpcs null def +/_inT false def +/_tr -1 def +/_rise 0 def +/_ax 0 def +/_cx 0 def +/_ld 0 def +/_tm matrix def +/_ctm matrix def +/_mtx matrix def +/_hy (-) def +/_fScl 0 def +/_hs 1 def +/_pdfEncodings 2 array def +/_Tj 0 def +/_italMtx[1 0 .212557 1 0 0]def +/_basefont 0 def +/_basefonto 0 def +/_categories 10 dict def +/_sa? true def +/_op? false def +/_ColorSep5044? false def +/_tmpcolr? [] def +/_tmpop? {} def +end +%%EndFile +PDFVars begin PDF begin +%%BeginFile: pdfutil.prc +%%Copyright: Copyright 1993 Adobe Systems Incorporated. All Rights Reserved. +/bd {bind def} bind def +/ld {load def} bd +/dd { PDFVars 3 1 roll put } bd +/xdd { exch dd } bd +/Level2? +/languagelevel where { pop languagelevel 2 ge } { false } ifelse +def +/here { +dup currentdict exch known +{ currentdict exch get true } +{ pop false } +ifelse +} bd +/isdefined? { where { pop true } { false } ifelse } bd +/StartLoad { dup dup not { /_save save dd } if } bd +/EndLoad { if not { _save restore } if } bd +/npop { { pop } repeat } bd +%%EndFile +%%BeginFile: pdf.prc +%%Copyright: Copyright 1987-1996 Adobe Systems Incorporated. All Rights Reserved. +/initialize { +_ColorSep5044? 
{sep_ops begin 50 dict begin} if +newpath +} bd +/terminate { +_ColorSep5044? {end end} if +} bd +Level2? StartLoad +{ /m/moveto ld +/l/lineto ld +/c/curveto ld +/setSA/setstrokeadjust ld +} EndLoad +Level2? not StartLoad +{ +/pl { +transform +0.25 sub round 0.25 add exch +0.25 sub round 0.25 add exch +itransform +} bd +/m { _sa? { pl } if moveto } bd +/l { _sa? { pl } if lineto } bd +/c { _sa? { pl } if curveto } bd +/setSA { /_sa? xdd } bd +} EndLoad +/v { currentpoint 6 2 roll c } bd +/y { 2 copy c } bd +/h/closepath ld +/d/setdash ld +/j/setlinejoin ld +/J/setlinecap ld +/M/setmiterlimit ld +/w/setlinewidth ld +/cf currentflat def +/i { +dup 0 eq { pop cf } if +setflat +} bd +/ilp { /_lp /none dd } bd +/sfc { +_lp /fill ne { +_sfcs +_sfc +/_lp /fill dd +} if +} dd +/ssc { +_lp /stroke ne { +_sscs +_ssc +/_lp /stroke dd +} if +} dd +/n { +_doClip 1 ge { +_doClip 1 eq { clip } { eoclip } ifelse +/_doClip 0 dd +} if +newpath +} bd +/f { +_doClip 1 ge +{ +gsave sfc fill grestore +_doClip 1 eq { clip } { eoclip } ifelse +newpath +ilp +/_doClip 0 dd +} +{ sfc fill } +ifelse +} bd +/f* { +_doClip 1 ge +{ +gsave sfc eofill grestore +_doClip 1 eq { clip } { eoclip } ifelse +newpath +ilp +/_doClip 0 dd +} +{ sfc eofill } +ifelse +} bd +/S { +_doClip 1 ge +{ +gsave ssc stroke grestore +_doClip 1 eq { clip } { eoclip } ifelse +newpath +ilp +/_doClip 0 dd +} +{ ssc stroke } +ifelse +} bd +/s { h S } bd +/B { +_doClip dup 1 ge +gsave f grestore +{ +gsave S grestore +1 eq { clip } { eoclip } ifelse +newpath +ilp +/_doClip 0 dd +} +{ pop S } +ifelse +} bd +/b { h B } bd +/B* { +_doClip dup 1 ge +gsave f* grestore +{ +gsave S grestore +1 eq { clip } { eoclip } ifelse +newpath +ilp +/_doClip 0 dd +} +{ pop S } +ifelse +} bd +/b* { h B* } bd +/W { /_doClip 1 dd } bd +/W* { /_doClip 2 dd } bd +/q/save ld +/Q { restore ilp } bd +Level2? 
StartLoad +{ /defineRes/defineresource ld +/findRes/findresource ld +currentglobal +true systemdict /setglobal get exec +[/Function /ExtGState /Form] +{ /Generic /Category findresource dup length dict copy /Category defineresource pop } +forall +systemdict /setglobal get exec +} EndLoad +Level2? not StartLoad +{ /AlmostFull? +{ dup maxlength exch length sub 2 le +} bind def +/Expand +{ 1 index maxlength mul cvi dict +dup begin exch { def } forall end +} bind def +/xput +{ 3 2 roll +dup 3 index known not +{ dup AlmostFull? { 1.5 Expand } if +} if +dup 4 2 roll put +} bind def +/defineRes +{ _categories 1 index known not +{ /_categories _categories 2 index 10 dict xput store +} if +_categories exch 2 copy get 5 -1 roll 4 index xput put +} bind def +/findRes +{ _categories exch get exch get +} bind def +} EndLoad +/cs +{ +dup where { pop load } if +dup /_fcs xdd +ucs +_cpcf exch get +/_fc xdd +/_fp null dd +} bd +/CS +{ +dup where { pop load } if +dup /_scs xdd ucs _cpcs exch get /_sc xdd /_sp null dd +} bd +/ucs { +dup type /arraytype eq +{ dup 0 get +dup /Indexed eq +{ pop 0 get } +{ /Pattern eq +{ dup length 2 eq +{ 1 get ucs } +{ 0 get } +ifelse } +{ 0 get } +ifelse } +ifelse } +if } +bd +/_cpcf +15 dict dup begin +/DefaultGray _f1 def +/DeviceGray _f1 def +/DefaultRGB _f3 def +/DeviceRGB _f3 def +/DeviceCMYK _f4 def +/CalGray _f1 def +/CalRGB _f3 def +/CalCMYK _f4 def +/Lab _f3 def +/Pattern _f0 def +/Indexed _f1 def +/Separation _f1 def +/CIEBasedA _f1 def +/CIEBasedABC _f3 def +end +dd +/_cpcs +15 dict dup begin +/DefaultGray _s1 def +/DeviceGray _s1 def +/DefaultRGB _s3 def +/DeviceRGB _s3 def +/DeviceCMYK _s4 def +/CalGray _s1 def +/CalRGB _s3 def +/CalCMYK _s4 def +/Lab _s3 def +/Pattern _s0 def +/Indexed _s1 def +/Separation _s1 def +/CIEBasedA _s1 def +/CIEBasedABC _s3 def +end +dd +Level2? not StartLoad { +/ri/pop ld +/makePat/pop ld +} EndLoad +Level2? StartLoad { +/ri +{ +/findcolorrendering isdefined? 
+{ +mark exch +findcolorrendering +counttomark 2 eq +{ type /booleantype eq +{ dup type /nametype eq +{ dup /ColorRendering resourcestatus +{ pop pop +dup /DefaultColorRendering ne +{ +/ColorRendering findresource +setcolorrendering +} if +} if +} if +} if +} if +cleartomark +} +{ pop +} ifelse +} bd +/makePat /makepattern ld +} EndLoad +Level2? not _ColorSep5044? or StartLoad +{ +/L1setcolor { +aload length +dup 0 eq +{ pop .5 setgray } +{ dup 1 eq +{ pop setgray } +{ 3 eq +{ setrgbcolor } +{ setcmykcolor } +ifelse } +ifelse } +ifelse +} bind dd +/_sfcs { } dd +/_sscs { } dd +} EndLoad +Level2? not _ColorSep5044? not and StartLoad +{ +/_sfc { _fc L1setcolor } dd +/_ssc { _sc L1setcolor } dd +} EndLoad +Level2? _ColorSep5044? not and StartLoad +{ +/_sfcs +{ +_fcs setcolorspace +} bind dd +/_sscs +{ +_scs setcolorspace +} bind dd +/_sfc +{ +_fc aload pop +_fp null eq +{ setcolor } +{ _fp setpattern } +ifelse +} bind dd +/_ssc +{ +_sc aload pop +_sp null eq { setcolor } { _sp setpattern } ifelse +} bind dd +} EndLoad +/sc +{ +_fc astore pop +ilp +} bd +/SC +{ +_sc astore pop +ilp +} bd +/scn { +dup type /dicttype eq +{ dup /_fp xdd +/PaintType get 1 eq +{ /_fc _f0 dd ilp } +{ /_fc _cpcf _fcs ucs get dd +sc } +ifelse } +{ sc } +ifelse +} bd +/SCN { +dup type /dicttype eq +{ dup /_sp xdd +/PaintType get 1 eq +{ /_sc _s0 dd ilp } +{ /_sc _cpcs _scs ucs get dd +SC } +ifelse } +{ SC } +ifelse +} bd +/g { /DefaultGray cs sc } bd +/rg { /DefaultRGB cs sc } bd +/k { /DeviceCMYK cs sc } bd +/G { /DefaultGray CS SC } bd +/RG { /DefaultRGB CS SC } bd +/K { /DeviceCMYK CS SC } bd +/cm { _mtx astore concat } bd +/re { +4 2 roll m +1 index 0 rlineto +0 exch rlineto +neg 0 rlineto +h +} bd +/RC/rectclip ld +/EF/execform ld +/PS { cvx exec } bd +/initgs { +/DefaultGray where +{ pop } +{ /DefaultGray /DeviceGray dd } +ifelse +/DefaultRGB where +{ pop } +{ /DefaultRGB /DeviceRGB dd } +ifelse +0 g 0 G +[] 0 d 0 j 0 J 10 M 1 w +true setSA +} bd +21 dict dup begin +/CosineDot +{ 180 mul 
cos exch 180 mul cos add 2 div } bd +/Cross +{ abs exch abs 2 copy gt { exch } if pop neg } bd +/Diamond +{ abs exch abs 2 copy add .75 le +{ dup mul exch dup mul add 1 exch sub } +{ 2 copy add 1.23 le +{ .85 mul add 1 exch sub } +{ 1 sub dup mul exch 1 sub dup mul add 1 sub } +ifelse } +ifelse } bd +/Double +{ exch 2 div exch 2 { 360 mul sin 2 div exch } repeat add } bd +/DoubleDot +{ 2 { 360 mul sin 2 div exch } repeat add } bd +/Ellipse +{ abs exch abs 2 copy 3 mul exch 4 mul add 3 sub dup 0 lt +{ pop dup mul exch .75 div dup mul add 4 div +1 exch sub } +{ dup 1 gt +{pop 1 exch sub dup mul exch 1 exch sub +.75 div dup mul add 4 div 1 sub } +{ .5 exch sub exch pop exch pop } +ifelse } +ifelse } bd +/EllipseA +{ dup mul .9 mul exch dup mul add 1 exch sub } bd +/EllipseB +{ dup 5 mul 8 div mul exch dup mul exch add sqrt 1 exch sub } bd +/EllipseC +{ dup .5 gt { 1 exch sub } if +dup .25 ge +{ .5 exch sub 4 mul dup mul 1 sub } +{ 4 mul dup mul 1 exch sub } +ifelse +exch +dup .5 gt { 1 exch sub } if +dup .25 ge +{ .5 exch sub 4 mul dup mul 1 sub } +{ 4 mul dup mul 1 exch sub } +ifelse +add -2 div } bd +/InvertedDouble +{ exch 2 div exch 2 { 360 mul sin 2 div exch } repeat add neg } bd +/InvertedDoubleDot +{ 2 { 360 mul sin 2 div exch } repeat add neg } bd +/InvertedEllipseA +{ dup mul .9 mul exch dup mul add 1 sub } bd +/InvertedSimpleDot +{ dup mul exch dup mul add 1 sub } bd +/Line +{ exch pop abs neg } bd +/LineX +{ pop } bd +/LineY +{ exch pop } bd +/Rhomboid +{ abs exch abs 0.9 mul add 2 div } bd +/Round +{ abs exch abs 2 copy add 1 le +{ dup mul exch dup mul add 1 exch sub } +{ 1 sub dup mul exch 1 sub dup mul add 1 sub } +ifelse } bd +/SimpleDot +{ dup mul exch dup mul add 1 exch sub } bd +/Square +{ abs exch abs 2 copy lt { exch } if pop neg } bd +end +{ /Function defineRes pop } forall +/Identity {} /Function defineRes pop +Level2? 
StartLoad { +/gs +{ +begin +/SA here { setstrokeadjust } if +/OP here { setoverprint } if +/BG here { setblackgeneration } if +/UCR here { setundercolorremoval } if +/HT here { sethalftone } if +/sethalftonephase isdefined? { /HTP here { sethalftonephase } if } if +/TR here +{ +dup xcheck { settransfer } { aload pop setcolortransfer } ifelse +} if +end +} bd +{ /Default /Halftone findresource pop } stopped +{ +currenthalftone exch defineresource pop } +if +} EndLoad +Level2? not StartLoad { +/gs +{ +begin +/SA here { /_sa? xdd } if +/OP here { dup /_op? xdd +/setoverprint where {pop setoverprint} +{pop} ifelse +} if +/HT here { sethalftone } if +/TR here { dup xcheck +{ settransfer } +{ pop } +ifelse } if +end +} bd +5 dict dup +begin +currentscreen 1 [/HalftoneType /SpotFunction /Angle /Frequency] +{ exch def } forall +end +/Default exch /Halftone defineRes pop +} EndLoad +/int { +dup 2 index sub 3 index 5 index sub div 6 -2 roll sub mul +exch pop add exch pop +} bd +/limit { +dup 2 index le { exch } if pop +dup 2 index ge { exch } if pop +} bd +_ColorSep5044? 
StartLoad { +/_sfc +{ +_fp null eq +{ _fcs type /arraytype eq +{_fcs 0 get /Separation eq +{ +_fcs 1 get /All eq +{ +_fc aload pop 1 exch sub +/setseparationgray where pop begin setseparationgray end +} +{ +1 _fcs 3 get exec _fcs 1 get +/findcmykcustomcolor where pop begin findcmykcustomcolor end +_fc aload pop +/setcustomcolor where pop begin setcustomcolor end +} +ifelse +} +{ _fc L1setcolor } +ifelse +} +{ _fc L1setcolor } +ifelse +} +{ _fp setpattern } +ifelse +} bind dd +/_ssc +{ +_sp null eq +{ _scs type /arraytype eq +{_scs 0 get /Separation eq +{ +_scs 1 get /All eq +{ +_sc aload pop 1 exch sub +/setseparationgray where pop begin setseparationgray end +} +{ +1 _scs 3 get exec _scs 1 get +/findcmykcustomcolor where pop begin findcmykcustomcolor end +_sc aload pop +/setcustomcolor where pop begin setcustomcolor end +} +ifelse +} +{ _sc L1setcolor } +ifelse +} +{ _sc L1setcolor } +ifelse +} +{ _sp setpattern } +ifelse +} bind dd +} EndLoad +%%EndFile +%%BeginFile: pdftext.prc +%%Copyright: Copyright 1987-1994 Adobe Systems Incorporated. All Rights Reserved. 
+PDF /PDFText 51 dict dup begin put +/initialize { PDFText begin } bd +/terminate { end } bd +/CopyFont { +{ +1 index /FID ne 2 index /UniqueID ne and +{ def } { pop pop } ifelse +} forall +} bd +/modEnc { +/_enc xdd +/_icode 0 dd +counttomark 1 sub -1 0 +{ +index +dup type /nametype eq +{ +_enc _icode 3 -1 roll put +_icode 1 add +} +if +/_icode xdd +} for +cleartomark +_enc +} bd +/trEnc { +/_enc xdd +255 -1 0 { +exch dup -1 eq +{ pop /.notdef } +{ Encoding exch get } +ifelse +_enc 3 1 roll put +} for +pop +_enc +} bd +/TE { +/_i xdd +StandardEncoding 256 array copy modEnc +_pdfEncodings exch _i exch put +} bd +/TZ +{ +/_usePDFEncoding xdd +findfont +dup length 2 add dict +begin +{ +1 index /FID ne { def } { pop pop } ifelse +} forall +/FontName exch def +_usePDFEncoding 0 ge +{ +/Encoding _pdfEncodings _usePDFEncoding get def +pop +} +{ +_usePDFEncoding -1 eq +{ +counttomark 0 eq +{ pop } +{ +Encoding 256 array copy +modEnc /Encoding exch def +} +ifelse +} +{ +256 array +trEnc /Encoding exch def +} +ifelse +} +ifelse +FontName currentdict +end +definefont pop +} +bd +/_pdfIsLevel2 +systemdict /languagelevel known +{languagelevel 2 ge} +{false} +ifelse +def +_pdfIsLevel2 +{ +/_pdfFontStatus +{ +dup /Font resourcestatus +{pop pop pop true} +{ +/CIDFont /Category resourcestatus +{ +pop pop +/CIDFont resourcestatus +{pop pop true} +{false} +ifelse +} +{ pop false } +ifelse +} +ifelse +} bd +} +{ +/_pdfFontStatusString 50 string def +_pdfFontStatusString 0 (fonts/) putinterval +/_pdfFontStatus +{ +_pdfFontStatusString 6 42 getinterval +cvs length 6 add +_pdfFontStatusString exch 0 exch getinterval +status +{ pop pop pop pop true} +{ false } +ifelse +} bd +} +ifelse +/_pdfString100 100 string def +/_pdfComposeFontName +{ +dup length 1 eq +{ +0 get +1 index +type /nametype eq +{ +_pdfString100 cvs +length dup dup _pdfString100 exch (-) putinterval +_pdfString100 exch 1 add dup _pdfString100 length exch sub getinterval +2 index exch cvs length +add 1 add _pdfString100 
exch 0 exch getinterval +exch pop +true +} +{ +pop pop +false +} +ifelse +} +{ +false +} +ifelse +} bd +systemdict /composefont known +{ +/_pdfComposeFont +{ +1 index /CMap resourcestatus +{pop pop true} +{false} +ifelse +1 index true exch +{ +_pdfFontStatus not +{pop false exit} +if +} +forall +and +{composefont true} +{ +_pdfComposeFontName +{ +dup _pdfFontStatus +{ findfont definefont true } +{ pop pop false } +ifelse +} +{ +dup _pdfFontStatus +{ findfont true } +{ pop false } +ifelse +} +ifelse +} +ifelse +} bd +} +{ +/_pdfComposeFont +{ +_pdfComposeFontName not +{ +dup +} +if +2 copy _pdfFontStatus +{ pop findfont definefont true } +{ +eq +{pop false} +{ +dup _pdfFontStatus +{findfont true} +{pop false} +ifelse +} +ifelse +} +ifelse +} bd +} +ifelse +/_pdfFaceByStyleDict 4 dict dup begin +_pdfIsLevel2 +{ +/Serif +/Ryumin-Light-83pv-RKSJ-H /Font resourcestatus +{pop pop /Ryumin-Light} +{/HeiseiMin-W3} +ifelse +def +/SansSerif +/GothicBBB-Medium-83pv-RKSJ-H /Font resourcestatus +{pop pop /GothicBBB-Medium} +{/HeiseiKakuGo-W5} +ifelse +def +/Jun101-Light-83pv-RKSJ-H /Font resourcestatus +{pop pop /RoundSansSerif /Jun101-Light def } +{ +/HeiseiMaruGo-W4-83pv-RKSJ-H /Font resourcestatus +{pop pop /RoundSansSerif /HeiseiMaruGo-W4 def} +if +} +ifelse +/Default Serif def +} +{ +/Serif /Ryumin-Light def +/SansSerif /GothicBBB-Medium def +{ +(fonts/Jun101-Light-83pv-RKSJ-H) status +}stopped +{pop}{ +{pop pop pop pop /RoundSansSerif /Jun101-Light def } +if +}ifelse +/Default Serif def +} +ifelse +end +def +/TZzero +{ +/_styleArr xdd +3 copy +_pdfComposeFont +{exch pop exch pop exch pop} +{ +[ +0 1 _styleArr length 1 sub +{ +_styleArr exch get +_pdfFaceByStyleDict exch 2 copy known not +{ pop /Default } +if +get +} +for +] +exch pop +2 index 3 1 roll +_pdfComposeFont +{exch pop} +{ +findfont +dup length dict +begin +{1 index /FID ne {def}{pop pop} ifelse } +forall +currentdict +end +} +ifelse +} +ifelse +definefont pop +} +bd +/swj { +dup 4 1 roll +dup length exch 
stringwidth +exch 5 -1 roll 3 index mul add +4 1 roll 3 1 roll mul add +6 2 roll /_cnt 0 dd +{1 index eq {/_cnt _cnt 1 add dd} if} forall pop +exch _cnt mul exch _cnt mul 2 index add 4 1 roll 2 index add 4 1 roll pop pop +} bd +/jss { +4 1 roll +{ +2 npop +(0) exch 2 copy 0 exch put +gsave +32 eq +{ +exch 6 index 6 index 6 index 5 -1 roll widthshow +currentpoint +} +{ +false charpath currentpoint +4 index setmatrix stroke +} +ifelse +grestore +moveto +2 copy rmoveto +} exch cshow +6 npop +} def +/jsp +{ +{ +2 npop +(0) exch 2 copy 0 exch put +32 eq +{ exch 5 index 5 index 5 index 5 -1 roll widthshow } +{ false charpath } +ifelse +2 copy rmoveto +} exch cshow +5 npop +} bd +/trj { _cx 0 32 _ax 0 6 5 roll } bd +/pjsf { trj sfc awidthshow } bd +/pjss { trj _ctm ssc jss } bd +/pjsc { trj jsp } bd +/_Tjdef [ +/pjsf load +/pjss load +{ +dup +currentpoint 3 2 roll +pjsf +newpath moveto +pjss +} bind +{ +trj swj rmoveto +} bind +{ +dup currentpoint 4 2 roll gsave +pjsf +grestore 3 1 roll moveto +pjsc +} bind +{ +dup currentpoint 4 2 roll +currentpoint gsave newpath moveto +pjss +grestore 3 1 roll moveto +pjsc +} bind +{ +dup currentpoint 4 2 roll gsave +dup currentpoint 3 2 roll +pjsf +newpath moveto +pjss +grestore 3 1 roll moveto +pjsc +} bind +/pjsc load +] def +/BT +{ +/_inT true dd +_ctm currentmatrix pop matrix _tm copy pop +0 _rise translate _hs 1 scale +0 0 moveto +} bd +/ET +{ +/_inT false dd +_tr 3 gt {clip} if +_ctm setmatrix newpath +} bd +/Tr { +_inT { _tr 3 le {currentpoint newpath moveto} if } if +dup /_tr xdd +_Tjdef exch get /_Tj xdd +} bd +/Tj { +userdict /$$copystring 2 index put +_Tj +} bd +/iTm { _ctm setmatrix _tm concat 0 _rise translate _hs 1 scale } bd +/Tm { _tm astore pop iTm 0 0 moveto } bd +/Td { _mtx translate _tm _tm concatmatrix pop iTm 0 0 moveto } bd +/TD { dup /_ld xdd Td } bd +/Tf { +dup 1000 div /_fScl xdd +exch findfont exch scalefont setfont +} bd +/TL { neg /_ld xdd } bd +/Tw { /_cx xdd } bd +/Tc { /_ax xdd } bd +/Ts { /_rise xdd 
currentpoint iTm moveto } bd +/Tz { 100 div /_hs xdd iTm } bd +/Tk { exch pop _fScl mul neg 0 rmoveto } bd +/T* { 0 _ld Td } bd +/' { T* Tj } bd +/" { exch Tc exch Tw ' } bd +/TJ { +{ +dup type /stringtype eq +{ Tj } +{ 0 exch Tk } +ifelse +} forall +} bd +/T- { _hy Tj } bd +/d0/setcharwidth ld +/d1 { setcachedevice /sfc{}dd /ssc{}dd } bd +/nND {{/.notdef} repeat} bd +/T3Defs { +/BuildChar +{ +1 index /Encoding get exch get +1 index /BuildGlyph get exec +} +def +/BuildGlyph { +exch begin +GlyphProcs exch get exec +end +} def +} bd +/MakeBold { +findfont dup dup length 2 add dict +begin +CopyFont +/PaintType 2 def +/StrokeWidth .03 0 FontMatrix idtransform pop def +/dummybold currentdict +end +definefont +8 dict begin +/_basefont exch def +/_basefonto exch def +/FontType 3 def +/FontMatrix[1 0 0 1 0 0]def +/FontBBox[0 0 1 1]def +/Encoding StandardEncoding def +/BuildChar +{ +exch begin +_basefont setfont +( )dup 0 4 -1 roll put +dup stringwidth +1 index 0 ne { exch .03 add exch }if +setcharwidth +0 0 moveto +gsave +dup show +grestore +_basefonto setfont +show +end +}bd +currentdict +end +definefont pop +}bd +/MakeItalic { +findfont _italMtx makefont +dup length dict +begin +CopyFont +currentdict +end +definefont pop +}bd +/MakeBoldItalic { +/dummybold exch +MakeBold +/dummybold +MakeItalic +}bd +currentdict readonly pop end +%%EndFile +PDFText begin +[39/quotesingle 96/grave 128/Adieresis/Aring/Ccedilla/Eacute/Ntilde/Odieresis +/Udieresis/aacute/agrave/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex/idieresis/ntilde +/oacute/ograve/ocircumflex/odieresis/otilde/uacute/ugrave/ucircumflex +/udieresis/dagger/degree/cent/sterling/section/bullet/paragraph/germandbls +/registered/copyright/trademark/acute/dieresis/.notdef/AE/Oslash +/.notdef/plusminus/.notdef/.notdef/yen/mu/.notdef/.notdef +/.notdef/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae/oslash 
+/questiondown/exclamdown/logicalnot/.notdef/florin/.notdef/.notdef +/guillemotleft/guillemotright/ellipsis/.notdef/Agrave/Atilde/Otilde/OE/oe +/endash/emdash/quotedblleft/quotedblright/quoteleft/quoteright/divide +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft/guilsinglright +/fi/fl/daggerdbl/periodcentered/quotesinglbase/quotedblbase/perthousand +/Acircumflex/Ecircumflex/Aacute/Edieresis/Egrave/Iacute/Icircumflex +/Idieresis/Igrave/Oacute/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex +/Ugrave/dotlessi/circumflex/tilde/macron/breve/dotaccent/ring/cedilla +/hungarumlaut/ogonek/caron +0 TE +[1/dotlessi/caron 39/quotesingle 96/grave +127/bullet/bullet/bullet/quotesinglbase/florin/quotedblbase/ellipsis +/dagger/daggerdbl/circumflex/perthousand/Scaron/guilsinglleft/OE +/bullet/bullet/bullet/bullet/quoteleft/quoteright/quotedblleft +/quotedblright/bullet/endash/emdash/tilde/trademark/scaron +/guilsinglright/oe/bullet/bullet/Ydieresis/space/exclamdown/cent/sterling +/currency/yen/brokenbar/section/dieresis/copyright/ordfeminine +/guillemotleft/logicalnot/hyphen/registered/macron/degree/plusminus +/twosuperior/threesuperior/acute/mu/paragraph/periodcentered/cedilla +/onesuperior/ordmasculine/guillemotright/onequarter/onehalf/threequarters +/questiondown/Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE/Ccedilla +/Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex/Idieresis +/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis/multiply/Oslash +/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute/Thorn/germandbls/agrave +/aacute/acircumflex/atilde/adieresis/aring/ae/ccedilla/egrave/eacute +/ecircumflex/edieresis/igrave/iacute/icircumflex/idieresis/eth/ntilde +/ograve/oacute/ocircumflex/otilde/odieresis/divide/oslash/ugrave/uacute +/ucircumflex/udieresis/yacute/thorn/ydieresis +1 TE +end +currentdict readonly pop +end end +/currentpacking where {pop setpacking}if +PDFVars/InitAll{[PDF PDFText]{/initialize get exec}forall initgs 0 Tr}put 
+PDFVars/TermAll{[PDFText PDF]{/terminate get exec}forall}put +PDFVars begin PDF begin PDFVars/InitAll get exec +/N93 << +/SA false +/OP false +>> /ExtGState defineRes pop +% Begin encoding-delta +[ 39 /quotesingle 96 /grave 127 /.notdef/Adieresis/Aring/Ccedilla +/Eacute/Ntilde/Odieresis/Udieresis/aacute/agrave +/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex +/idieresis/ntilde/oacute/ograve/ocircumflex/odieresis +/otilde/uacute/ugrave/ucircumflex/udieresis/dagger +/.notdef/cent/sterling/section/bullet/paragraph +/germandbls/registered/copyright/trademark/acute/dieresis +/.notdef/AE/Oslash/.notdef/.notdef/.notdef +/.notdef/yen/.notdef/.notdef/.notdef/.notdef +/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae +/oslash/questiondown/exclamdown/logicalnot/.notdef/florin +/.notdef/.notdef/guillemotleft/guillemotright/ellipsis/.notdef +/Agrave/Atilde/Otilde/OE/oe/endash +/emdash/quotedblleft/quotedblright/quoteleft/quoteright/.notdef +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft +/guilsinglright/fi/fl/daggerdbl/periodcentered/quotesinglbase +/quotedblbase/perthousand/Acircumflex/Ecircumflex/Aacute/Edieresis +/Egrave/Iacute/Icircumflex/Idieresis/Igrave/Oacute +/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex/Ugrave +/dotlessi/circumflex/tilde/macron/breve/dotaccent +/ring/cedilla/hungarumlaut/ogonek/caron +/N92/Times-Roman -1 TZ +% End encoding-delta +% Begin encoding-delta +[ 39 /quotesingle 96 /grave 127 /.notdef/Adieresis/Aring/Ccedilla +/Eacute/Ntilde/Odieresis/Udieresis/aacute/agrave +/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex +/idieresis/ntilde/oacute/ograve/ocircumflex/odieresis +/otilde/uacute/ugrave/ucircumflex/udieresis/dagger +/.notdef/cent/sterling/section/bullet/paragraph +/germandbls/registered/copyright/trademark/acute/dieresis +/.notdef/AE/Oslash/.notdef/.notdef/.notdef +/.notdef/yen/.notdef/.notdef/.notdef/.notdef 
+/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae +/oslash/questiondown/exclamdown/logicalnot/.notdef/florin +/.notdef/.notdef/guillemotleft/guillemotright/ellipsis/.notdef +/Agrave/Atilde/Otilde/OE/oe/endash +/emdash/quotedblleft/quotedblright/quoteleft/quoteright/.notdef +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft +/guilsinglright/fi/fl/daggerdbl/periodcentered/quotesinglbase +/quotedblbase/perthousand/Acircumflex/Ecircumflex/Aacute/Edieresis +/Egrave/Iacute/Icircumflex/Idieresis/Igrave/Oacute +/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex/Ugrave +/dotlessi/circumflex/tilde/macron/breve/dotaccent +/ring/cedilla/hungarumlaut/ogonek/caron +/N91/Times-Bold -1 TZ +% End encoding-delta +% Begin encoding-delta +[ 39 /quotesingle 96 /grave 127 /.notdef/Adieresis/Aring/Ccedilla +/Eacute/Ntilde/Odieresis/Udieresis/aacute/agrave +/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex +/idieresis/ntilde/oacute/ograve/ocircumflex/odieresis +/otilde/uacute/ugrave/ucircumflex/udieresis/dagger +/.notdef/cent/sterling/section/bullet/paragraph +/germandbls/registered/copyright/trademark/acute/dieresis +/.notdef/AE/Oslash/.notdef/.notdef/.notdef +/.notdef/yen/.notdef/.notdef/.notdef/.notdef +/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae +/oslash/questiondown/exclamdown/logicalnot/.notdef/florin +/.notdef/.notdef/guillemotleft/guillemotright/ellipsis/.notdef +/Agrave/Atilde/Otilde/OE/oe/endash +/emdash/quotedblleft/quotedblright/quoteleft/quoteright/.notdef +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft +/guilsinglright/fi/fl/daggerdbl/periodcentered/quotesinglbase +/quotedblbase/perthousand/Acircumflex/Ecircumflex/Aacute/Edieresis +/Egrave/Iacute/Icircumflex/Idieresis/Igrave/Oacute +/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex/Ugrave +/dotlessi/circumflex/tilde/macron/breve/dotaccent +/ring/cedilla/hungarumlaut/ogonek/caron +/N94/Times-Italic -1 TZ +% End encoding-delta +% Begin 
encoding-delta +[ 39 /quotesingle 96 /grave 127 /.notdef/Adieresis/Aring/Ccedilla +/Eacute/Ntilde/Odieresis/Udieresis/aacute/agrave +/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex +/idieresis/ntilde/oacute/ograve/ocircumflex/odieresis +/otilde/uacute/ugrave/ucircumflex/udieresis/dagger +/.notdef/cent/sterling/section/bullet/paragraph +/germandbls/registered/copyright/trademark/acute/dieresis +/.notdef/AE/Oslash/.notdef/.notdef/.notdef +/.notdef/yen/.notdef/.notdef/.notdef/.notdef +/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae +/oslash/questiondown/exclamdown/logicalnot/.notdef/florin +/.notdef/.notdef/guillemotleft/guillemotright/ellipsis/.notdef +/Agrave/Atilde/Otilde/OE/oe/endash +/emdash/quotedblleft/quotedblright/quoteleft/quoteright/.notdef +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft +/guilsinglright/fi/fl/daggerdbl/periodcentered/quotesinglbase +/quotedblbase/perthousand/Acircumflex/Ecircumflex/Aacute/Edieresis +/Egrave/Iacute/Icircumflex/Idieresis/Igrave/Oacute +/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex/Ugrave +/dotlessi/circumflex/tilde/macron/breve/dotaccent +/ring/cedilla/hungarumlaut/ogonek/caron +/N95/Courier -1 TZ +% End encoding-delta +% Begin encoding-delta +[/N96/Symbol -1 TZ +% End encoding-delta +% Begin encoding-delta +[/N97/Times-Roman -1 TZ +% End encoding-delta +% Begin encoding-delta +[ 39 /quotesingle 96 /grave 127 /.notdef/Adieresis/Aring/Ccedilla +/Eacute/Ntilde/Odieresis/Udieresis/aacute/agrave +/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex +/idieresis/ntilde/oacute/ograve/ocircumflex/odieresis +/otilde/uacute/ugrave/ucircumflex/udieresis/dagger +/.notdef/cent/sterling/section/bullet/paragraph +/germandbls/registered/copyright/trademark/acute/dieresis +/.notdef/AE/Oslash/.notdef/.notdef/.notdef +/.notdef/yen/.notdef/.notdef/.notdef/.notdef 
+/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae +/oslash/questiondown/exclamdown/logicalnot/.notdef/florin +/.notdef/.notdef/guillemotleft/guillemotright/ellipsis/.notdef +/Agrave/Atilde/Otilde/OE/oe/endash +/emdash/quotedblleft/quotedblright/quoteleft/quoteright/.notdef +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft +/guilsinglright/fi/fl/daggerdbl/periodcentered/quotesinglbase +/quotedblbase/perthousand/Acircumflex/Ecircumflex/Aacute/Edieresis +/Egrave/Iacute/Icircumflex/Idieresis/Igrave/Oacute +/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex/Ugrave +/dotlessi/circumflex/tilde/macron/breve/dotaccent +/ring/cedilla/hungarumlaut/ogonek/caron +/N98/Courier-Bold -1 TZ +% End encoding-delta +% Begin encoding-delta +[ 39 /quotesingle 96 /grave 127 /.notdef/Adieresis/Aring/Ccedilla +/Eacute/Ntilde/Odieresis/Udieresis/aacute/agrave +/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex +/idieresis/ntilde/oacute/ograve/ocircumflex/odieresis +/otilde/uacute/ugrave/ucircumflex/udieresis/dagger +/.notdef/cent/sterling/section/bullet/paragraph +/germandbls/registered/copyright/trademark/acute/dieresis +/.notdef/AE/Oslash/.notdef/.notdef/.notdef +/.notdef/yen/.notdef/.notdef/.notdef/.notdef +/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae +/oslash/questiondown/exclamdown/logicalnot/.notdef/florin +/.notdef/.notdef/guillemotleft/guillemotright/ellipsis/.notdef +/Agrave/Atilde/Otilde/OE/oe/endash +/emdash/quotedblleft/quotedblright/quoteleft/quoteright/.notdef +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft +/guilsinglright/fi/fl/daggerdbl/periodcentered/quotesinglbase +/quotedblbase/perthousand/Acircumflex/Ecircumflex/Aacute/Edieresis +/Egrave/Iacute/Icircumflex/Idieresis/Igrave/Oacute +/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex/Ugrave +/dotlessi/circumflex/tilde/macron/breve/dotaccent +/ring/cedilla/hungarumlaut/ogonek/caron +/N133/Helvetica -1 TZ +% End encoding-delta +% Begin 
encoding-delta +[ 39 /quotesingle 96 /grave 127 /.notdef/Adieresis/Aring/Ccedilla +/Eacute/Ntilde/Odieresis/Udieresis/aacute/agrave +/acircumflex/adieresis/atilde/aring/ccedilla/eacute +/egrave/ecircumflex/edieresis/iacute/igrave/icircumflex +/idieresis/ntilde/oacute/ograve/ocircumflex/odieresis +/otilde/uacute/ugrave/ucircumflex/udieresis/dagger +/.notdef/cent/sterling/section/bullet/paragraph +/germandbls/registered/copyright/trademark/acute/dieresis +/.notdef/AE/Oslash/.notdef/.notdef/.notdef +/.notdef/yen/.notdef/.notdef/.notdef/.notdef +/.notdef/.notdef/ordfeminine/ordmasculine/.notdef/ae +/oslash/questiondown/exclamdown/logicalnot/.notdef/florin +/.notdef/.notdef/guillemotleft/guillemotright/ellipsis/.notdef +/Agrave/Atilde/Otilde/OE/oe/endash +/emdash/quotedblleft/quotedblright/quoteleft/quoteright/.notdef +/.notdef/ydieresis/Ydieresis/fraction/currency/guilsinglleft +/guilsinglright/fi/fl/daggerdbl/periodcentered/quotesinglbase +/quotedblbase/perthousand/Acircumflex/Ecircumflex/Aacute/Edieresis +/Egrave/Iacute/Icircumflex/Idieresis/Igrave/Oacute +/Ocircumflex/.notdef/Ograve/Uacute/Ucircumflex/Ugrave +/dotlessi/circumflex/tilde/macron/breve/dotaccent +/ring/cedilla/hungarumlaut/ogonek/caron +/N130/Helvetica-Bold -1 TZ +% End encoding-delta +PDFVars/TermAll get exec end end + +%%EndSetup +%%Page: 1 1 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +/N134 << +/SA false +/OP false +>> /ExtGState defineRes pop +%%EndPageSetup +0 0 612 792 RC +1 g +/N134 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N130 1 Tf +18 0 0 18 341.69 636 Tm +0 g +0 Tc +0 Tw +[(An Ef\336cient Meta-loc)20 (k f)20 (o)0 (r)]TJ +-8.0106 -1.1111 TD +[(Implementing Ubiquitous Sync)10 (hr)20 (onization)]TJ +/N133 1 Tf +12 0 0 12 319.9 562 Tm +[( Ole Agesen, Da)21 (vid Detlefs)15 (, Ale)30 (x Gar)-40 (thw)15 (aite)15 (,)]TJ +-2.7992 -1.0833 TD +[(Ross Knippel, Y)140 (.)0 ( S)21 (.)0 ( Ramakr)-15 (ishna, and Derek White)]TJ +ET +PDFVars/TermAll 
get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 2 2 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +/N141 << +/SA true +/OP false +>> /ExtGState defineRes pop +/N142 << +/SA false +/OP false +>> /ExtGState defineRes pop +%%EndPageSetup +0 0 612 792 RC +1 g +/N142 /ExtGState findRes gs +1 i +0 0 612 792 re +f +0 g +/N141 /ExtGState findRes gs +168.891 99.828 m +168.891 99.172 168.383 98.66 167.723 98.66 c +167.063 98.66 166.554 99.172 166.554 99.828 c +166.554 100.485 167.063 100.997 167.723 100.997 c +168.383 100.997 168.891 100.485 168.891 99.828 c +h +166.737 99.828 m +166.737 99.269 167.158 98.826 167.723 98.826 c +168.288 98.826 168.709 99.269 168.709 99.828 c +168.709 100.387 168.288 100.83 167.723 100.83 c +167.158 100.83 166.737 100.387 166.737 99.828 c +h +167.465 99.153 m +167.299 99.153 l +167.299 100.503 l +167.817 100.503 l +168.122 100.503 168.257 100.371 168.257 100.13 c +168.257 99.891 168.1 99.784 167.918 99.753 c +168.32 99.153 l +168.125 99.153 l +167.745 99.753 l +167.465 99.753 l +167.465 99.153 l +h +167.685 99.894 m +167.889 99.894 168.09 99.9 168.09 100.13 c +168.09 100.318 167.933 100.362 167.77 100.362 c +167.465 100.362 l +167.465 99.894 l +167.685 99.894 l +f +122.149 116.628 m +122.042 111.155 128.194 108.58 128.087 103.787 c +128.087 102.463 127.086 99.029 123.58 99.101 c +121.827 99.137 120.146 100.317 120.181 103.357 c +120.181 105.504 l +120.146 105.682 120.146 105.897 120.003 106.076 c +119.895 106.183 119.681 106.255 119.502 106.183 c +119.18 106.004 119.108 105.647 118.965 105.36 c +118.786 104.43 118.572 103.465 118.393 102.535 c +118.357 102.463 118.465 102.463 118.393 102.463 c +118.25 101.784 118.143 101.068 118.107 100.388 c +118.071 99.959 118.357 99.602 118.715 99.387 c +120.038 98.672 121.72 98.171 123.58 98.135 c +128.123 98.064 131.843 102.034 131.915 105.897 c +131.95 109.009 130.198 110.797 128.266 113.122 c +127.3 114.338 
125.368 116.02 125.44 118.237 c +125.44 120.026 126.907 122.029 129.59 121.993 c +133.203 121.922 132.63 118.094 132.702 116.341 c +132.702 116.163 132.702 115.984 132.845 115.912 c +132.988 115.769 133.203 115.733 133.382 115.805 c +133.489 115.912 133.56 115.984 133.596 116.091 c +134.49 119.453 l +134.634 120.061 134.92 120.67 134.92 121.385 c +134.92 121.564 134.777 121.743 134.598 121.886 c +132.702 122.887 131.557 123.03 129.804 123.066 c +125.368 123.138 122.185 119.99 122.149 116.628 c +h +132.959 109.975 m +132.959 109.796 133.103 109.617 133.317 109.617 c +133.782 109.617 134.033 110.547 135.177 112.085 c +135.821 113.051 136.644 113.802 137.073 113.802 c +137.395 113.802 137.431 113.552 137.431 113.337 c +137.431 112.264 133.353 103.679 133.353 100.317 c +133.353 99.28 133.782 98.278 135.106 98.278 c +138.254 98.278 141.652 102.249 144.085 105.825 c +144.192 105.933 144.263 105.933 144.228 105.754 c +143.727 104.287 143.011 101.497 143.011 99.78 c +143.011 99.065 143.155 98.207 144.299 98.207 c +147.089 98.207 150.703 103.679 150.703 104.967 c +150.703 105.182 150.631 105.396 150.381 105.396 c +149.951 105.396 149.808 104.681 149.343 103.894 c +148.484 102.535 147.089 100.746 146.41 100.746 c +146.159 100.746 146.159 101.032 146.159 101.247 c +146.159 103.071 150.524 114.982 150.524 115.34 c +150.524 115.626 150.488 115.805 150.059 115.805 c +149.773 115.805 149.307 115.841 148.663 115.841 c +147.483 115.841 147.411 115.662 147.304 115.447 c +146.732 114.338 145.873 110.797 144.156 107.685 c +141.831 103.644 138.683 100.46 137.359 100.46 c +136.894 100.46 136.608 100.818 136.608 101.354 c +136.608 103.143 140.937 112.407 140.937 114.66 c +140.937 115.197 140.65 116.27 139.363 116.27 c +135.893 116.27 132.959 110.618 132.959 109.975 c +h +165.701 116.234 m +162.231 116.234 158.546 110.726 157.294 108.866 c +157.223 108.687 157.187 108.758 157.223 108.901 c +157.831 111.119 158.546 113.087 158.546 114.553 c +158.546 115.447 158.117 116.341 156.972 116.341 
c +154.54 116.341 150.998 111.548 150.998 110.368 c +150.998 110.118 151.07 109.939 151.249 109.939 c +151.678 109.939 152.036 110.762 152.68 111.62 c +153.503 112.622 154.504 113.909 155.112 113.909 c +155.22 113.909 155.327 113.838 155.327 113.48 c +155.327 111.978 152.25 103.5 150.963 99.28 c +150.819 98.815 151.106 98.707 152.465 98.707 c +153.431 98.707 153.896 98.564 154.075 99.315 c +154.862 102.141 156.328 105.289 157.938 108.043 c +159.405 110.618 162.553 114.338 164.306 114.338 c +164.699 114.338 164.95 114.052 164.95 113.444 c +164.95 111.656 160.371 103.107 160.514 99.852 c +160.55 98.922 160.872 98.242 162.052 98.242 c +164.771 98.242 167.167 102.034 168.276 104.037 c +168.455 104.359 168.741 104.717 168.491 105.11 c +168.42 105.253 168.276 105.325 168.098 105.325 c +167.99 105.325 167.883 105.217 167.811 105.11 c +167.06 103.787 165.415 100.889 164.306 100.818 c +164.091 100.818 163.984 101.032 163.984 101.211 c +163.984 102.499 167.99 110.833 167.99 113.659 c +167.99 115.555 166.953 116.234 165.701 116.234 c +f +90.144 108.471 m +89.113 107.44 87.442 107.441 86.411 108.471 c +85.381 109.502 85.381 111.174 86.411 112.205 c +92.524 118.316 l +94.159 116.681 l +88.003 110.527 l +88.465 110.065 l +94.621 116.219 l +96.257 114.584 l +90.144 108.471 l +f +94.904 112.271 m +95.934 113.301 97.605 113.301 98.636 112.271 c +99.667 111.24 99.667 109.568 98.636 108.537 c +92.524 102.426 l +90.888 104.06 l +97.044 110.215 l +96.582 110.677 l +90.427 104.522 l +88.79 106.158 l +94.904 112.271 l +f +102.829 99.589 m +103.86 98.559 103.86 96.888 102.829 95.857 c +101.798 94.827 100.126 94.827 99.095 95.857 c +92.983 101.969 l +94.618 103.604 l +100.773 97.449 l +101.235 97.911 l +95.08 104.066 l +96.716 105.702 l +102.829 99.589 l +f +99.029 104.349 m +97.998 105.379 97.998 107.05 99.029 108.081 c +100.06 109.111 101.732 109.111 102.763 108.081 c +108.875 101.969 l +107.24 100.334 l +101.085 106.489 l +100.623 106.027 l +106.778 99.872 l +105.142 98.236 l +99.029 
104.349 l +f +106.957 108.474 m +105.926 107.443 104.255 107.443 103.225 108.474 c +102.194 109.505 102.194 111.177 103.225 112.207 c +109.337 118.319 l +110.972 116.684 l +104.816 110.529 l +105.278 110.067 l +111.434 116.222 l +113.07 114.586 l +106.957 108.474 l +f +111.717 112.273 m +112.747 113.304 114.418 113.304 115.449 112.273 c +116.48 111.242 116.48 109.57 115.449 108.54 c +109.337 102.428 l +107.702 104.063 l +113.857 110.218 l +113.395 110.68 l +107.24 104.525 l +105.603 106.161 l +111.717 112.273 l +f +102.829 116.398 m +103.86 115.368 103.86 113.697 102.829 112.666 c +101.798 111.636 100.126 111.635 99.095 112.666 c +92.983 118.778 l +94.618 120.413 l +100.773 114.258 l +101.235 114.72 l +95.08 120.875 l +96.716 122.511 l +102.829 116.398 l +f +99.029 121.158 m +97.998 122.188 97.998 123.859 99.029 124.889 c +100.06 125.92 101.732 125.92 102.763 124.89 c +108.875 118.778 l +107.24 117.143 l +101.085 123.298 l +100.623 122.836 l +106.778 116.681 l +105.142 115.045 l +99.029 121.158 l +f +124.334 91.075 m +124.334 93.452 l +124.334 93.966 124.211 94.294 123.712 94.294 c +123.168 94.294 122.646 93.735 122.646 92.841 c +122.646 91.075 l +121.963 91.075 l +121.963 93.46 l +121.963 93.921 121.863 94.294 121.341 94.294 c +120.757 94.294 120.274 93.69 120.274 92.841 c +120.274 91.075 l +119.583 91.075 l +119.583 94.778 l +120.251 94.778 l +120.251 94.562 120.236 94.227 120.189 94.003 c +120.205 93.996 l +120.427 94.518 120.919 94.853 121.533 94.853 c +122.339 94.853 122.553 94.309 122.584 94.011 c +122.745 94.361 123.152 94.853 123.889 94.853 c +124.611 94.853 125.025 94.473 125.025 93.638 c +125.025 91.075 l +124.334 91.075 l +f +130.241 91.023 m +129.328 91.023 128.568 91.44 128.568 92.804 c +128.568 93.996 129.197 94.838 130.472 94.838 c +130.771 94.838 131.07 94.793 131.339 94.719 c +131.262 94.13 l +131.024 94.219 130.74 94.286 130.441 94.286 c +129.673 94.286 129.297 93.735 129.297 92.893 c +129.297 92.171 129.543 91.597 130.387 91.597 c +130.709 91.597 
131.047 91.671 131.293 91.798 c +131.347 91.217 l +131.093 91.12 130.686 91.023 130.241 91.023 c +f +134.271 94.19 m +133.465 94.324 133.081 93.75 133.081 92.565 c +133.081 91.075 l +132.39 91.075 l +132.39 94.778 l +133.058 94.778 l +133.058 94.547 133.035 94.205 132.974 93.899 c +132.989 93.899 l +133.15 94.413 133.542 94.927 134.302 94.845 c +134.271 94.19 l +f +136.617 91.008 m +135.581 91.008 134.906 91.589 134.906 92.893 c +134.906 93.981 135.612 94.845 136.763 94.845 c +137.738 94.845 138.483 94.324 138.483 92.99 c +138.483 91.873 137.746 91.008 136.617 91.008 c +h +136.702 94.286 m +136.149 94.286 135.627 93.877 135.627 92.96 c +135.627 92.066 136.003 91.575 136.702 91.575 c +137.27 91.575 137.761 92.007 137.761 92.953 c +137.761 93.795 137.408 94.286 136.702 94.286 c +f +140.27 91.008 m +139.986 91.008 139.702 91.031 139.464 91.075 c +139.487 91.671 l +139.725 91.597 140.032 91.545 140.324 91.545 c +140.83 91.545 141.176 91.768 141.176 92.111 c +141.176 92.938 139.403 92.431 139.403 93.735 c +139.403 94.361 139.932 94.845 140.93 94.845 c +141.16 94.845 141.429 94.815 141.659 94.771 c +141.644 94.212 l +141.406 94.286 141.13 94.331 140.877 94.331 c +140.37 94.331 140.117 94.123 140.117 93.802 c +140.117 92.99 141.897 93.43 141.897 92.186 c +141.897 91.515 141.283 91.008 140.27 91.008 c +f +144.388 90.703 m +143.943 89.556 143.583 89.183 142.738 89.183 c +142.592 89.183 142.393 89.205 142.239 89.228 c +142.301 89.831 l +142.462 89.779 142.615 89.757 142.792 89.757 c +143.214 89.757 143.437 89.943 143.705 90.658 c +143.866 91.075 l +142.424 94.778 l +143.206 94.778 l +143.928 92.856 l +144.051 92.521 144.135 92.23 144.22 91.94 c +144.235 91.94 l +144.304 92.2 144.465 92.67 144.611 93.08 c +145.225 94.778 l +145.977 94.778 l +144.388 90.703 l +f +147.397 91.008 m +147.113 91.008 146.829 91.031 146.591 91.075 c +146.614 91.671 l +146.852 91.597 147.159 91.545 147.45 91.545 c +147.957 91.545 148.302 91.768 148.302 92.111 c +148.302 92.938 146.529 92.431 146.529 
93.735 c +146.529 94.361 147.059 94.845 148.057 94.845 c +148.287 94.845 148.556 94.815 148.786 94.771 c +148.77 94.212 l +148.533 94.286 148.256 94.331 148.003 94.331 c +147.496 94.331 147.243 94.123 147.243 93.802 c +147.243 92.99 149.024 93.43 149.024 92.186 c +149.024 91.515 148.41 91.008 147.397 91.008 c +f +151.329 91.016 m +150.546 91.016 150.308 91.329 150.308 92.126 c +150.308 94.257 l +149.525 94.257 l +149.525 94.778 l +150.308 94.778 l +150.308 95.978 l +150.999 96.164 l +150.999 94.778 l +152.066 94.778 l +152.066 94.257 l +150.999 94.257 l +150.999 92.364 l +150.999 91.753 151.114 91.589 151.544 91.589 c +151.72 91.589 151.912 91.619 152.066 91.657 c +152.066 91.09 l +151.851 91.046 151.574 91.016 151.329 91.016 c +f +156.044 92.968 m +153.549 92.968 l +153.519 91.977 153.941 91.552 154.754 91.552 c +155.146 91.552 155.56 91.642 155.867 91.783 c +155.929 91.239 l +155.56 91.09 155.115 91.008 154.639 91.008 c +153.465 91.008 152.836 91.604 152.836 92.908 c +152.836 94.018 153.457 94.845 154.54 94.845 c +155.614 94.845 156.067 94.137 156.067 93.288 c +156.067 93.199 156.059 93.094 156.044 92.968 c +h +154.509 94.354 m +154.01 94.354 153.657 93.989 153.58 93.445 c +155.361 93.445 l +155.376 93.989 155.046 94.354 154.509 94.354 c +f +161.845 91.075 m +161.845 93.452 l +161.845 93.966 161.722 94.294 161.223 94.294 c +160.678 94.294 160.156 93.735 160.156 92.841 c +160.156 91.075 l +159.473 91.075 l +159.473 93.46 l +159.473 93.921 159.373 94.294 158.851 94.294 c +158.268 94.294 157.784 93.69 157.784 92.841 c +157.784 91.075 l +157.093 91.075 l +157.093 94.778 l +157.761 94.778 l +157.761 94.562 157.746 94.227 157.7 94.003 c +157.715 93.996 l +157.938 94.518 158.429 94.853 159.043 94.853 c +159.849 94.853 160.064 94.309 160.095 94.011 c +160.256 94.361 160.663 94.853 161.399 94.853 c +162.121 94.853 162.535 94.473 162.535 93.638 c +162.535 91.075 l +161.845 91.075 l +f +164.433 91.008 m +164.149 91.008 163.865 91.031 163.627 91.075 c +163.65 91.671 l 
+163.888 91.597 164.195 91.545 164.487 91.545 c +164.994 91.545 165.339 91.768 165.339 92.111 c +165.339 92.938 163.566 92.431 163.566 93.735 c +163.566 94.361 164.096 94.845 165.093 94.845 c +165.324 94.845 165.592 94.815 165.823 94.771 c +165.807 94.212 l +165.569 94.286 165.293 94.331 165.04 94.331 c +164.533 94.331 164.28 94.123 164.28 93.802 c +164.28 92.99 166.061 93.43 166.061 92.186 c +166.061 91.515 165.447 91.008 164.433 91.008 c +f +126.966 95.389 m +127.219 95.389 127.426 95.59 127.426 95.829 c +127.426 96.075 127.219 96.261 126.966 96.261 c +126.713 96.261 126.505 96.067 126.505 95.829 c +126.505 95.583 126.713 95.389 126.966 95.389 c +f +126.062 94.787 m +126.062 94.265 l +126.62 94.265 l +126.62 91.075 l +127.311 91.075 l +127.311 94.778 l +126.062 94.787 l +f +168.659 79.658 m +168.659 79.388 l +168.309 79.388 l +168.309 78.062 l +167.99 78.062 l +167.99 79.388 l +167.64 79.388 l +167.64 79.658 l +168.659 79.658 l +f +170.491 78.062 m +170.153 78.062 l +170.146 78.478 170.134 78.892 170.099 79.32 c +170.094 79.32 l +169.734 78.062 l +169.496 78.062 l +169.148 79.32 l +169.143 79.32 l +169.108 78.892 169.096 78.478 169.089 78.062 c +168.786 78.062 l +168.8 78.594 168.838 79.128 168.89 79.658 c +169.361 79.658 l +169.647 78.627 l +169.652 78.627 l +169.947 79.658 l +170.387 79.658 l +170.439 79.128 170.477 78.594 170.491 78.062 c +f +89.088 79.647 m +89.088 78.949 l +88.181 78.949 l +88.181 75.513 l +87.354 75.513 l +87.354 78.949 l +86.448 78.949 l +86.448 79.647 l +89.088 79.647 l +f +90.342 79.647 m +90.342 78.018 l +91.745 78.018 l +91.745 79.647 l +92.571 79.647 l +92.571 75.513 l +91.745 75.513 l +91.745 77.369 l +90.342 77.369 l +90.342 75.513 l +89.515 75.513 l +89.515 79.647 l +90.342 79.647 l +f +95.322 79.647 m +95.322 78.961 l +94.018 78.961 l +94.018 77.944 l +95.102 77.944 l +95.102 77.295 l +94.018 77.295 l +94.018 76.211 l +95.347 76.211 l +95.347 75.513 l +93.191 75.513 l +93.191 79.647 l +95.322 79.647 l +f +98.582 79.647 m +99.905 
76.977 l +99.905 79.647 l +100.615 79.647 l +100.615 75.513 l +99.709 75.513 l +98.208 78.581 l +98.208 75.513 l +97.498 75.513 l +97.498 79.647 l +98.582 79.647 l +f +103.366 79.647 m +103.366 78.961 l +102.062 78.961 l +102.062 77.944 l +103.146 77.944 l +103.146 77.295 l +102.062 77.295 l +102.062 76.211 l +103.391 76.211 l +103.391 75.513 l +101.235 75.513 l +101.235 79.647 l +103.366 79.647 l +f +106.394 79.647 m +106.394 78.949 l +105.488 78.949 l +105.488 75.513 l +104.661 75.513 l +104.661 78.949 l +103.755 78.949 l +103.755 79.647 l +106.394 79.647 l +f +108.215 76.683 m +108.888 79.647 l +109.623 79.647 l +110.297 76.683 l +110.309 76.683 l +110.536 77.681 110.738 78.704 110.866 79.647 c +111.613 79.647 l +111.399 78.257 111.081 76.842 110.658 75.513 c +109.825 75.513 l +109.201 78.201 l +109.188 78.201 l +108.551 75.513 l +107.682 75.513 l +107.278 76.885 106.971 78.269 106.775 79.647 c +107.657 79.647 l +107.768 78.808 107.976 77.785 108.202 76.683 c +108.215 76.683 l +f +115.201 77.687 m +115.201 75.973 114.325 75.452 113.486 75.452 c +112.457 75.452 111.821 76.095 111.821 77.522 c +111.821 78.973 112.549 79.702 113.548 79.702 c +114.552 79.702 115.201 79.028 115.201 77.687 c +h +112.696 77.687 m +112.696 76.518 113.039 76.132 113.535 76.132 c +113.97 76.132 114.325 76.493 114.325 77.516 c +114.325 78.624 113.964 79.022 113.474 79.022 c +113.052 79.022 112.696 78.655 112.696 77.687 c +f +117.056 79.665 m +118.06 79.665 118.52 79.193 118.52 78.489 c +118.52 77.975 118.213 77.54 117.638 77.326 c +118.673 75.513 l +117.687 75.513 l +116.842 77.191 l +116.658 77.191 l +116.658 75.513 l +115.831 75.513 l +115.831 79.647 l +116.266 79.659 116.67 79.665 117.056 79.665 c +h +116.658 77.846 m +116.793 77.834 116.909 77.834 117.019 77.834 c +117.399 77.834 117.662 78.091 117.662 78.41 c +117.662 78.777 117.485 79.01 116.97 79.01 c +116.872 79.01 116.762 79.004 116.658 78.998 c +116.658 77.846 l +f +120.025 79.647 m +120.025 77.914 l +121.103 79.647 l +122.046 
79.647 l +120.699 77.803 l +122.12 75.513 l +121.091 75.513 l +120.025 77.399 l +120.025 75.513 l +119.199 75.513 l +119.199 79.647 l +120.025 79.647 l +f +123.818 79.647 0.827 -4.133 re +f +127.487 78.857 m +127.278 78.961 126.984 79.059 126.721 79.059 c +126.36 79.059 126.151 78.838 126.151 78.575 c +126.151 77.785 127.891 77.999 127.891 76.695 c +127.891 76.021 127.401 75.44 126.409 75.44 c +125.98 75.44 125.582 75.574 125.196 75.844 c +125.435 76.322 l +125.655 76.217 125.962 76.101 126.305 76.101 c +126.782 76.101 126.972 76.315 126.972 76.585 c +126.972 77.369 125.294 77.099 125.294 78.477 c +125.294 79.175 125.796 79.696 126.611 79.696 c +127.052 79.696 127.419 79.555 127.713 79.31 c +127.487 78.857 l +f +142.943 76.23 m +142.637 75.752 142.214 75.458 141.688 75.458 c +140.769 75.458 140.144 76.107 140.144 77.491 c +140.144 78.838 140.818 79.69 141.749 79.69 c +142.269 79.69 142.667 79.383 142.882 78.924 c +142.514 78.593 l +142.288 78.844 142.104 78.967 141.804 78.967 c +141.381 78.967 141.026 78.434 141.026 77.589 c +141.026 76.542 141.4 76.187 141.877 76.187 c +142.104 76.187 142.343 76.328 142.569 76.56 c +142.943 76.23 l +f +146.646 77.687 m +146.646 75.973 145.77 75.452 144.931 75.452 c +143.902 75.452 143.265 76.095 143.265 77.522 c +143.265 78.973 143.994 79.702 144.992 79.702 c +145.997 79.702 146.646 79.028 146.646 77.687 c +h +144.141 77.687 m +144.141 76.518 144.484 76.132 144.98 76.132 c +145.415 76.132 145.77 76.493 145.77 77.516 c +145.77 78.624 145.409 79.022 144.919 79.022 c +144.496 79.022 144.141 78.655 144.141 77.687 c +f +151.521 75.513 m +150.646 75.513 l +150.627 76.591 150.597 77.663 150.505 78.771 c +150.493 78.771 l +149.562 75.513 l +148.943 75.513 l +148.043 78.771 l +148.031 78.771 l +147.939 77.663 147.908 76.591 147.89 75.513 c +147.106 75.513 l +147.143 76.891 147.241 78.275 147.375 79.647 c +148.594 79.647 l +149.335 76.977 l +149.347 76.977 l +150.113 79.647 l +151.252 79.647 l +151.387 78.275 151.485 76.891 151.521 75.513 c 
+f +153.268 79.659 m +154.376 79.659 154.891 79.132 154.891 78.385 c +154.891 77.577 154.321 76.971 153.115 76.971 c +152.998 76.971 l +152.998 75.513 l +152.172 75.513 l +152.172 79.647 l +152.57 79.647 152.919 79.659 153.268 79.659 c +h +152.998 77.601 m +153.188 77.601 l +153.721 77.601 154.009 77.828 154.009 78.293 c +154.009 78.746 153.77 79.004 153.157 79.004 c +152.998 79.004 l +152.998 77.601 l +f +156.207 79.647 m +156.207 77.044 l +156.207 76.346 156.537 76.113 156.929 76.113 c +157.352 76.113 157.64 76.364 157.64 77.007 c +157.64 79.647 l +158.442 79.647 l +158.442 77.063 l +158.442 75.856 157.732 75.452 156.838 75.452 c +155.894 75.452 155.38 75.887 155.38 76.983 c +155.38 79.647 l +156.207 79.647 l +f +161.492 79.647 m +161.492 78.949 l +160.585 78.949 l +160.585 75.513 l +159.758 75.513 l +159.758 78.949 l +158.852 78.949 l +158.852 79.647 l +161.492 79.647 l +f +164.051 79.647 m +164.051 78.961 l +162.746 78.961 l +162.746 77.944 l +163.83 77.944 l +163.83 77.295 l +162.746 77.295 l +162.746 76.211 l +164.075 76.211 l +164.075 75.513 l +161.919 75.513 l +161.919 79.647 l +164.051 79.647 l +f +165.856 79.665 m +166.86 79.665 167.32 79.193 167.32 78.489 c +167.32 77.975 167.013 77.54 166.438 77.326 c +167.473 75.513 l +166.487 75.513 l +165.642 77.191 l +165.458 77.191 l +165.458 75.513 l +164.631 75.513 l +164.631 79.647 l +165.066 79.659 165.47 79.665 165.856 79.665 c +h +165.458 77.846 m +165.593 77.834 165.709 77.834 165.819 77.834 c +166.199 77.834 166.462 78.091 166.462 78.41 c +166.462 78.777 166.285 79.01 165.77 79.01 c +165.672 79.01 165.562 79.004 165.458 78.998 c +165.458 77.846 l +f +131.944 79.647 m +131.944 78.949 l +131.037 78.949 l +131.037 75.513 l +130.211 75.513 l +130.211 78.949 l +129.304 78.949 l +129.304 79.647 l +131.944 79.647 l +f +133.198 79.647 m +133.198 78.018 l +134.601 78.018 l +134.601 79.647 l +135.428 79.647 l +135.428 75.513 l +134.601 75.513 l +134.601 77.369 l +133.198 77.369 l +133.198 75.513 l +132.372 75.513 l 
+132.372 79.647 l +133.198 79.647 l +f +138.178 79.647 m +138.178 78.961 l +136.874 78.961 l +136.874 77.944 l +137.958 77.944 l +137.958 77.295 l +136.874 77.295 l +136.874 76.211 l +138.203 76.211 l +138.203 75.513 l +136.047 75.513 l +136.047 79.647 l +138.178 79.647 l +f +1 g +/N142 /ExtGState findRes gs +85.5 83.15 85.71 -19.48 re +f* +118.5 81.5 104.5 -34.25 re +f* +BT +/N92 1 Tf +9 0 0 9 118.5 75.5 Tm +0 g +0 Tc +0 Tw +(M/S MTV29-01)Tj +0 -1 TD +(901 San Antonio Road)Tj +T* +[(P)15 (alo Alto, CA 94303-4900)]TJ +/N130 1 Tf +18 0 0 18 85 672 Tm +[(An Ef\336cient Meta-loc)20 (k f)20 (o)0 (r)]TJ +0 -1.1111 TD +[(Implementing Ubiquitous Sync)10 (hr)20 (onization)]TJ +/N133 1 Tf +12 0 0 12 85 606 Tm +[(Ole Agesen, Da)20 (vid Detlefs)16 (, Ale)29 (x Gar)-40 (thw)15 (aite)15 (,)]TJ +0 -1.3333 TD +[(Ross Knippel, Y)140 (.)0 ( S)21 (.)0 ( Ramakr)-15 (ishna, and Derek White)]TJ +9 0 0 9 85 533 Tm +[(SMLI TR-99-76)-9165 (Apr)-15 (il 1999)]TJ +/N130 1 Tf +11 0 0 11 85 480.67 Tm +(Abstract:)Tj +/N133 1 Tf +10 0 0 10 85 458.33 Tm +[(Prog)10 (r)10 (ams)-431 (wr)-15 (itten)-431 (in)-431 (concurrent)-431 (object-or)-15 (iented)-431 (languages)15 (,)-431 (especially)-431 (ones)-432 (that)-431 (emplo)30 (y)-431 (threadsaf)30 (e)]TJ +0 -1.2 TD +[(reusab)20 (le)-277 (class)-277 (libr)10 (ar)-15 (ies)15 (,)-277 (can)-278 (e)31 (x)30 (ecute)-278 (synchronization)-277 (oper)10 (ations)-277 (\(loc)20 (k,)-277 (notify)100 (,)-277 (etc.\))-277 (at)-277 (an)-277 (amazing)-278 (r)10 (ate)15 (.)]TJ +T* +[(Unless)-365 (implemented)-364 (with)-365 (utmost)-364 (care)15 (,)-365 (synchronization)-365 (can)-364 (become)-365 (a)-364 (perf)30 (or)-25 (mance)-365 (bottlenec)20 (k.)-365 (Fur-)]TJ +T* +[(ther)-25 (more)15 (,)-431 (i)0 (n)-432 (languages)-431 (where)-432 (e)30 (v)25 (er)-30 (y)-431 (object)-432 (ma)30 (y)-431 (h)0 (a)20 (v)25 (e)-432 (its)-431 (o)15 (w)0 (n)-431 (monitor)50 (,)-432 (per-object)-431 (space)-432 (o)15 (v)25 (erhead)]TJ +T* +[(m)10 (ust)-275 (be)-275 (minimiz)15 
(ed.)-275 (T)120 (o)-274 (address)-275 (these)-275 (concer)-25 (ns)15 (,)-275 (w)10 (e)-275 (h)0 (a)20 (v)25 (e)-275 (d)0 (e)30 (v)25 (eloped)-274 (a)-275 (meta-loc)20 (k)-275 (t)0 (o)-275 (mediate)-275 (access)-275 (to)]TJ +T* +[(synchronization)-256 (data.)-257 (The)-256 (meta-loc)20 (k)-256 (i)0 (s)-256 (f)30 (ast)-257 (\(loc)20 (k)-256 (+)-256 (unloc)20 (k)-256 (e)30 (x)30 (ecutes)-257 (in)-256 (11)-256 (SP)120 (ARC\252)-256 (instr)-15 (uctions\),)-257 (com-)]TJ +T* +[(pact)-288 (\(uses)-288 (only)-288 (tw)10 (o)-288 (bits)-288 (of)-288 (space\),)-288 (rob)20 (ust)-288 (under)-288 (contention)-288 (\(no)-288 (b)19 (usy-w)15 (aiting\),)-287 (and)-289 (\337e)30 (xib)20 (le)-288 (\(suppor)-40 (ts)-288 (a)]TJ +T* +[(v)25 (a)0 (r)-15 (iety)-394 (of)-393 (higher-le)30 (v)25 (e)0 (l)-393 (synchronization)-394 (oper)10 (ations\).)-393 (W)30 (e)-394 (ha)20 (v)25 (e)-393 (v)25 (alidated)-394 (the)-393 (meta-loc)20 (k)-394 (with)-393 (an)-394 (imple-)]TJ +T* +[(mentation)-501 (of)-501 (the)-501 (synchronization)-500 (oper)9 (ations)-500 (in)-501 (a)-501 (high-perf)30 (or)-25 (mance)-501 (product-quality)-501 (J)20 (a)20 (v)25 (a\252)-501 (vir)-40 (tual)]TJ +T* +[(machine and repor)-40 (t perf)30 (or)-25 (mance data f)30 (or se)30 (v)25 (e)0 (r)10 (al large prog)10 (r)10 (ams)15 (.)]TJ +ET +1 g +441 117 112.71 -67.5 re +f* +BT +/N91 1 Tf +8 0 0 8 441 111.67 Tm +0 g +[(email addr)18 (ess:)]TJ +/N92 1 Tf +0 -1.25 TD +(ole.agesen@sun.com)Tj +0 -1.125 TD +[(da)20 (vid.detlefs@sun.com)]TJ +T* +[(ale)15 (x.garthw)11 (aite@sun.com)]TJ +T* +(ross.knippel@sun.com)Tj +T* +[(y)65 (.s.ramakrishna@sun.com)]TJ +T* +(derek.white@sun.com)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 3 3 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +/N9 << +/SA false +/OP false +>> /ExtGState defineRes pop +%%EndPageSetup +0 0 612 792 RC +1 g +/N9 /ExtGState findRes gs +1 i +0 0 612 792 
re +f +72 243 468 -207 re +f* +BT +/N92 1 Tf +8 0 0 8 72 237.67 Tm +0 g +0 Tc +[(\251)-328 (1999)-328 (Sun)-328 (Microsystems,)-328 (Inc.)-328 (All)-327 (rights)-328 (reserv)15 (ed.)-327 (The)-328 (SML)-328 (T)70 (echnical)-328 (Report)-327 (Series)-329 (is)-327 (published)-329 (by)-328 (Sun)-328 (Microsystems)-327 (Laboratories,)-329 (of)-327 (Sun)]TJ +0 -1.25 TD +0 Tw +(Microsystems, Inc. Printed in U.S.A.)Tj +0 -2.5 TD +[(Unlimited)-238 (cop)10 (ying)-237 (without)-237 (fee)-238 (is)-237 (permitted)-237 (pro)14 (vided)-237 (that)-238 (the)-237 (copies)-238 (are)-238 (not)-237 (made)-237 (nor)-237 (distrib)20 (uted)-238 (for)-237 (direct)-237 (commercial)-238 (adv)25 (antage,)-238 (and)-237 (credit)-237 (to)-237 (the)]TJ +0 -1.25 TD +[(source)-263 (is)-263 (gi)25 (v)15 (en.)-262 (Otherwise,)-263 (no)-263 (part)-263 (of)-263 (this)-263 (w)11 (ork)-263 (co)15 (v)15 (ered)-263 (by)-263 (cop)10 (yright)-262 (hereon)-263 (may)-262 (be)-264 (reproduced)-262 (in)-262 (an)14 (y)-262 (form)-262 (or)-262 (by)-263 (an)15 (y)-263 (means)-263 (graphic,)-264 (electronic,)]TJ +T* +[(or)-238 (mechanical,)-240 (including)-238 (photocop)9 (ying,)-238 (recording,)-239 (taping,)-239 (or)-240 (storage)-238 (in)-240 (an)-239 (information)-239 (retrie)25 (v)25 (a)0 (l)-239 (system,)-238 (without)-239 (the)-239 (prior)-239 (written)-238 (permission)-240 (of)-238 (t)]TJ +57.5555 0 TD +(he)Tj +-57.5555 -1.25 TD +[(cop)10 (yright o)25 (wner)54 (.)]TJ +0 -2.5 TD +(TRADEMARKS)Tj +0 -1.25 TD +[(Sun,)-347 (Sun)-345 (Microsystems,)-347 (the)-346 (Sun)-347 (logo,)-346 (Solaris,)-346 (Ja)19 (v)25 (a)0 (,)-346 (JDK,)-346 (Ja)20 (v)25 (a)-346 (W)80 (e)0 (b)-346 (Serv)14 (er)41 (,)-346 (and)-346 (Ja)19 (v)25 (a)-346 (HotSpot)-346 (are)-345 (trademarks)-346 (or)-347 (re)16 (gistered)-347 (trademarks)-346 (of)-346 (Sun)]TJ +T* +[(Microsystems,)-279 (Inc.)-281 (in)-280 (the)-279 (U.S.)-280 (and)-279 (other)-280 (countries.)-280 (All)-278 (SP)92 (ARC)-280 (trademarks)-280 (are)-279 (used)-280 
(under)-279 (license)-279 (and)-280 (are)-280 (trademarks)-280 (or)-280 (re)16 (gistered)-280 (trademarks)-280 (of)]TJ +T* +[(SP)92 (ARC)-277 (International,)-278 (Inc.)-277 (in)-277 (the)-277 (U.S.)-277 (and)-277 (other)-276 (countries.)-277 (Products)-278 (bearing)-277 (SP)92 (ARC)-277 (trademarks)-277 (are)-277 (based)-277 (upon)-278 (an)-277 (architecture)-276 (de)25 (v)15 (eloped)-278 (by)-276 (Sun)]TJ +T* +(Microsystems, Inc.)Tj +0 -2.5 TD +[(F)15 (or information re)16 (garding the SML T)69 (echnical Report Series, contact Jeanie T)36 (reichel, Editor)20 (-in-Chief .)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 4 4 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(1)Tj +/N91 1 Tf +18 0 0 18 204.99 664 Tm +[(An Ef\336cient Meta-lock f)24 (o)0 (r)]TJ +-3.4917 -1.2222 TD +[(Implementing Ubiquitous Synchr)18 (onization)]TJ +/N92 1 Tf +14 0 0 14 170.78 582.67 Tm +(Ole Agesen)Tj +11.2 0 0 11.2 236.48 588.27 Tm +(*)Tj +14 0 0 14 242.08 582.67 Tm +[(, Da)20 (vid Detlefs)]TJ +11.2 0 0 11.2 326.95 588.27 Tm +(*)Tj +14 0 0 14 332.55 582.67 Tm +[(, Ale)15 (x Garthw)10 (aite)]TJ +11.2 0 0 11.2 432.12 588.27 Tm +(*)Tj +14 0 0 14 437.72 582.67 Tm +(,)Tj +-19.425 -1.2857 TD +(Ross Knippel)Tj +11.2 0 0 11.2 241.61 570.27 Tm +(+)Tj +14 0 0 14 247.93 564.67 Tm +[(, Y)129 (.S. 
Ramakrishna)]TJ +11.2 0 0 11.2 355.39 570.27 Tm +(+)Tj +14 0 0 14 361.71 564.67 Tm +(, Derek White)Tj +11.2 0 0 11.2 440.63 570.27 Tm +(*)Tj +/N91 1 Tf +16 0 0 16 72 413.33 Tm +[(1 Intr)18 (oduction)]TJ +/N92 1 Tf +12 0 0 12 72 390 Tm +[(Shared-memory)-394 (multi-processor)-393 (systems)-394 (ha)20 (v)15 (e)-394 (become)-394 (mainstream,)-394 (e)25 (v)15 (en)-394 (in)-394 (the)-394 (personal)-394 (com-)]TJ +0 -1.1667 TD +[(puter)-337 (mark)10 (et.)-337 (Simultaneously)65 (,)-337 (the)-338 (concurrent)-337 (object-oriented)-336 (Ja)20 (v)25 (a)0 (\252)-337 (programming)-337 (language)-337 ([2])]TJ +T* +[(has)-253 (e)16 (xperienced)-253 (e)16 (xplosi)25 (v)15 (e)-253 (gro)25 (wth.)-252 (As)-253 (a)-252 (result,)-252 (signi\336cant)-253 (ef)25 (fort)-253 (is)-252 (being)-252 (de)25 (v)20 (oted)-252 (to)-253 (implementing)]TJ +T* +[(both)-380 (the)-381 (sequential)-380 (and)-381 (parallel)-380 (features)-381 (of)-380 (this)-381 (language,)-381 (e.g.,)-380 ([3,)-381 (13,)-381 (14,)-380 (21,)-381 (25].)-381 (This)-380 (paper)]TJ +T* +[(focuses)-242 (on)-242 (the)-242 (latter)-242 (area,)-242 (proposing)-241 (a)-242 (n)0 (e)25 (w)-241 (implementation)-242 (of)-242 (synchronization)-241 (operations)-242 (that,)-242 (we)]TJ +T* +[(belie)25 (v)15 (e)0 (,)-305 (possesses)-306 (an)-306 (attracti)25 (v)15 (e)-306 (set)-306 (of)-305 (trade-of)25 (fs)-305 (between)-305 (space,)-306 (time,)-306 (and)-305 (assumptions)-306 (about)-306 (the)]TJ +T* +[(underlying)-331 (hardw)10 (are.)-331 (Implementors)-331 (of)-331 (the)-331 (Ja)20 (v)25 (a)-331 (language\325)55 (s)-331 (synchronization)-331 (operations)-331 (are)-331 (chal-)]TJ +T* +[(lenged on tw)10 (o fronts:)]TJ +/N95 1 Tf +10 0 0 10 72 272 Tm +( \245)Tj +/N94 1 Tf +12 0 0 12 85.75 272 Tm +[(F)55 (r)37 (equency)55 (.)]TJ +/N92 1 Tf +4.7092 0 TD +[(Most)-330 (Ja)20 (v)25 (a)-330 (programs)-330 (synchronize)-329 (e)15 (xtremely)-330 (frequently)65 (.)-331 (This)-330 (occurs)-330 (because)-330 (stan-)]TJ +-4.7092 -1.1667 TD 
+[(dard)-227 (class)-227 (libraries,)-227 (including)-227 (commonly)-227 (used)-227 (data)-227 (types)-227 (such)-227 (as)-227 (v)15 (ectors)-227 (and)-227 (b)20 (u)0 (f)25 (fers,)-227 (ha)20 (v)15 (e)-227 (been)]TJ +T* +[(designed)-333 (for)-333 (the)-333 (general)-334 (multi-threaded)-333 (case.)-333 (T)80 (o)-333 (gi)26 (v)15 (e)-334 (just)-333 (one)-333 (e)15 (xample,)-333 (we)-333 (measured)-333 (on)-333 (the)]TJ +T* +[(SPECjvm98)-343 (v)15 (ersion)-343 (of)-343 (ja)20 (v)25 (a)0 (c)-344 ([26],)-343 (a)-343 (source-to-bytecode)-342 (compiler)40 (,)-343 (and)-343 (found)-343 (that)-343 (on)-342 (a)-344 (high-)]TJ +T* +[(performance)-360 (virtual)-361 (machine,)-359 (EVM)]TJ +9.6 0 0 9.6 261.01 220.8 Tm +(1)Tj +12 0 0 12 265.81 216 Tm +[(,)-360 (this)-360 (program)-360 (e)15 (x)15 (ecutes)-360 (765,000)-360 (synchronization)-360 (opera-)]TJ +-15.005 -1.1667 TD +(tions per second.)Tj +/N95 1 Tf +10 0 0 10 72 182 Tm +( \245)Tj +/N94 1 Tf +12 0 0 12 85.75 182 Tm +[(Ubiquity)56 (.)]TJ +/N92 1 Tf +4.0467 0 TD +[(The)-352 (Ja)21 (v)25 (a)-352 (language,)-352 (unlik)10 (e)-352 (most)-352 (concurrent)-351 (languages,)-352 (does)-352 (not)-352 (de\336ne)-352 (a)-352 (particular)]TJ +-4.0467 -1.1667 TD +[(type)-270 (of)-270 (synchronizable)-271 (object)-270 (such)-270 (as)-270 (a)-271 (monitor)-270 (or)-270 (a)-270 (mute)15 (x.)-271 (Instead,)-270 (an)15 (y)-270 (object,)-271 (including,)-270 (for)]TJ +T* +[(e)15 (xample,)-323 (strings)-324 (and)-323 (arrays,)-324 (can)-324 (be)-324 (synchronized)-323 (upon.)-324 (This)-324 (design)-324 (of)26 (fers)-323 (the)-324 (programmer)-323 (a)]TJ +T* +[(simpler)-247 (and)-246 (more)-246 (re)15 (gular)-246 (language,)-246 (b)20 (u)0 (t)-246 (presents)-246 (an)-246 (obstacle)-246 (to)-246 (the)-246 (language)-246 (implementor:)-247 (syn-)]TJ +T* +[(chronization)-247 (must)-248 (be)-248 (implemented)-248 (at)-248 (a)-248 (l)0 (o)25 (w)-247 (per)20 (-object)-248 (space)-247 (cost.)-248 (More)-247 (precisely)65 (,)-248 (the)]TJ +/N94 1 
Tf +34.2725 0 TD +(ability)Tj +/N92 1 Tf +2.8033 0 TD +(to)Tj +ET +0 G +2 J +0 j +0.36 w +3.86 M +[]0 d +204 98 m +72 98 l +S +BT +10 0 0 10 72 83.33 Tm +[(1.)-699 (EVM,)-350 (the)-350 (Ja)20 (v)25 (a)-350 (virtual)-349 (machine)-350 (kno)25 (wn)-350 (pre)25 (viously)-350 (as)-349 (ExactVM,)-350 (is)-350 (embedded)-349 (in)-350 (Sun\325)55 (s)-350 (J)0 (a)20 (v)25 (a)-349 (2)-350 (SDK)]TJ +/N94 1 Tf +42.345 0 TD +[(Pr)45 (oduction)]TJ +/N92 1 Tf +-42.345 -1.2 TD +[(Release, a)20 (v)25 (ailable at http://www)65 (.sun.com/solaris/ja)20 (v)25 (a/.)]TJ +11.2 0 0 11.2 118.23 536.77 Tm +(*)Tj +14 0 0 14 126.07 531.17 Tm +(Sun Microsystems Laboratories)Tj +0 -1.2143 TD +[(One Netw)10 (ork Dri)25 (v)15 (e)]TJ +T* +(Burlington, MA 01803-0902)Tj +11.2 0 0 11.2 330.9 536.77 Tm +(+)Tj +14 0 0 14 340.57 531.17 Tm +(Sun Microsystems, Inc.)Tj +T* +(901 San Antonio Road)Tj +T* +[(P)15 (alo Alto, CA 94303-4900)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 5 5 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(2)Tj +-18.1042 56.4858 TD +[(synchronize)-325 (must)-326 (be)-326 (pro)16 (vided)-325 (at)-326 (a)-325 (l)0 (o)25 (w)-325 (space)-326 (cost;)]TJ +/N94 1 Tf +20.9425 0 TD +(actual)Tj +/N92 1 Tf +2.8258 0 TD +[(synchronization)-325 (can)-326 (use)-325 (additional)]TJ +-23.7683 -1.1667 TD +[(space)-376 (since,)-377 (in)-376 (practice,)-376 (programs)-376 (synchronize)-376 (on)-377 (a)-376 (small)-376 (fraction)-377 (of)-376 (objects.)-376 (F)15 (o)0 (r)-376 (e)15 (xample,)]TJ +T* +[(ja)20 (v)25 (ac, discussed abo)15 (v)15 (e, synchronizes on about 6% of allocated objects.)]TJ +-1.1458 -1.75 TD +[(Frequenc)15 (y)-244 (demands)-244 (time-ef)26 (\336cienc)14 (y)-243 (while)-245 (ubiquity)-243 (demands)-244 (space-ef)25 
(\336cienc)15 (y)65 (.)-244 (This)-244 (paper)-244 (presents)]TJ +0 -1.1667 TD +[(a)-289 (n)0 (e)25 (w)-290 (synchronization)-289 (scheme)-289 (that,)-290 (we)-289 (belie)25 (v)15 (e)0 (,)-289 (attains)-290 (good)-289 (all-around)-290 (performance:)-290 (synchroni-)]TJ +T* +[(zation)-295 (e)15 (x)15 (ecutes)-295 (at)-295 (close)-295 (to)-295 (the)-295 (speed)-295 (of)-295 (the)-295 (hardw)10 (are)-295 (operations)-295 (while)-295 (reserving)-295 (only)-295 (tw)10 (o)-295 (bits)-295 (in)]TJ +T* +[(each)-296 (object.)-297 (Our)-296 (algorithm,)-297 (in)-296 (its)-297 (basic)-296 (form,)-297 (uses)-296 (a)-296 (t)0 (w)10 (o-le)25 (v)15 (e)0 (l)-296 (\(\322meta\323\))-297 (locking)-296 (scheme,)-297 (with)-296 (an)]TJ +T* +[(optional e)16 (xtension that fuses the tw)10 (o le)25 (v)15 (els for higher performance in uncontended cases.)]TJ +0 -1.75 TD +[(W)80 (e)-276 (assume)-276 (the)-276 (arbitrary)-276 (interlea)21 (ving)-276 (implied)-276 (by)-276 (preempti)25 (v)15 (e)-276 (thread)-276 (scheduling:)-276 (no)-276 (other)-276 (assump-)]TJ +0 -1.1667 TD +[(tion)-336 (mak)10 (es)-337 (sense)-336 (on)-337 (multiprocessors,)-337 (and,)-337 (moreo)15 (v)15 (e)0 (r)40 (,)-337 (e)25 (v)15 (en)-337 (on)-337 (uniprocessors)-337 (lack)-336 (of)-337 (preemption)]TJ +T* +[(can)-344 (lead)-344 (to)-344 (unf)10 (air)-344 (\(and)-344 (unintuiti)25 (v)15 (e)0 (\))-345 (thread)-344 (scheduling.)-343 (It)-344 (may)-345 (seem)-344 (that)-344 (synchronization)-343 (opera-)]TJ +T* +[(tions)-226 (could)-225 (be)-225 (elided)-225 (for)-226 (man)15 (y)-225 (programs)-225 (with)-225 (only)-226 (a)-225 (single)-225 (thread.)-226 (In)-225 (reality)65 (,)-226 (h)0 (o)25 (w)0 (e)25 (v)15 (er)40 (,)-226 (n)0 (o)-225 (program)]TJ +T* +[(written)-333 (in)-334 (the)-333 (Ja)21 (v)25 (a)-333 (language)-332 (is)-334 (single-threaded,)-332 (since,)-333 (in)-333 (addition)-333 (to)-334 (the)-333 (main)-332 (thread)-333 (created)-333 (by)]TJ +T* +[(user)-373 (code,)-374 (the)-374 (class)-373 
(libraries)-374 (create)-373 (special)-374 (threads)-374 (to)-373 (handle)-374 (\336nalization)-373 (and)-373 (v)25 (arious)-373 (forms)-373 (of)]TJ +T* +[(weak)-275 (references)-274 ([27].)-275 (More)-274 (signi\336cantly)65 (,)-275 (perhaps,)-275 (commonly)-274 (used)-274 (graphics)-274 (libraries,)-274 (such)-274 (as)-275 (the)]TJ +T* +[(Abstract W)41 (indo)25 (ws T)80 (oolkit \(A)90 (WT\), create threads.)]TJ +0 -1.75 TD +[(The)-279 (rest)-278 (of)-279 (this)-279 (paper)-279 (is)-279 (or)18 (ganized)-279 (as)-279 (follo)26 (ws.)-279 (Section)-279 (2)-279 (informally)-279 (describes)-279 (the)-279 (Ja)21 (v)25 (a)-279 (language\325)56 (s)]TJ +0 -1.1667 TD +[(synchronization)-357 (operations,)-357 (at)-358 (both)-357 (the)-357 (source)-357 (and)-358 (bytecode)-356 (le)25 (v)15 (els.)-357 (Section)-357 (3)-357 (r)0 (e)25 (vie)25 (ws)-357 (pre)25 (vious)]TJ +T* +[(w)10 (ork)-338 (most)-337 (closely)-337 (related)-337 (to)-338 (our)-337 (synchronization)-337 (algorithm.)-337 (Section)-337 (4)-337 (describes)-337 (our)-338 (basic)-337 (tw)10 (o-)]TJ +T* +[(le)25 (v)15 (e)0 (l)-228 (algorithm,)-227 (and)-228 (Section)-228 (5)-227 (g)0 (i)25 (v)15 (es)-228 (an)-228 (informal)-227 (correctness)-228 (proof.)-227 (Section)-228 (6)-228 (discusses)-228 (e)16 (xtensions,)]TJ +T* +[(including)-268 (a)-269 (f)10 (ast)-268 (path)-269 (that)-268 (fuses)-268 (the)-269 (primary)-268 (and)-268 (meta-lock)-268 (le)25 (v)15 (els,)-268 (and)-268 (other)-268 (optimizations)-268 (that)-269 (are)]TJ +T* +[(important)-269 (for)-269 (good)-269 (performance.)-269 (Section)-269 (7)-270 (presents)-269 (performance)-269 (data)-269 (to)-269 (quantify)-269 (the)-269 (beha)20 (vior)-269 (of)]TJ +T* +[(our algorithm. 
Section 8 of)26 (fers \336nal conclusions and some directions for further w)12 (ork.)]TJ +/N91 1 Tf +16 0 0 16 72 348.33 Tm +[(2 Backgr)18 (ound)]TJ +/N92 1 Tf +12 0 0 12 72 325 Tm +[(Ja)20 (v)25 (a)-254 (virtual)-254 (machines)-254 (\(JVMs\))-254 (do)-254 (not)-254 (e)16 (x)15 (ecute)-255 (source)-254 (code)-254 (directly)65 (.)-254 (Instead,)-253 (the)15 (y)-254 (e)15 (x)15 (ecute)-254 (bytecode)]TJ +T* +[(obtained)-232 (from)-233 (binary)-233 (class-\336les)-232 (that)-233 (are)-232 (produced)-233 (by)-232 (a)-233 (source-to-bytecode)-232 (compiler)-233 (such)-233 (as)-233 (ja)20 (v)25 (ac.)]TJ +T* +[(Thus,)-292 (JVMs)-292 (must)-292 (implement)-292 (the)-292 (bytecode-le)25 (v)15 (e)0 (l)-291 (synchronization)-292 (operations,)-291 (not)-292 (the)-292 (source-le)25 (v)15 (e)0 (l)]TJ +T* +[(synchronization)-384 (operations.)-385 (The)-385 (distinction)-384 (is)-385 (important)-384 (because)-385 (the)-385 (bytecode-le)25 (v)15 (e)0 (l)-385 (operations)]TJ +T* +[(are more general than the source-le)25 (v)15 (el operations.)]TJ +/N91 1 Tf +14 0 0 14 72 236.67 Tm +[(2.1 Sour)18 (ce-le)15 (v)10 (el synchr)17 (onization)]TJ +/N92 1 Tf +12 0 0 12 72 212 Tm +[(The)-304 (Ja)20 (v)25 (a)-304 (language)-304 (pro)15 (vides)-304 (mutual)-304 (e)16 (xclusion)-304 (in)-304 (tw)10 (o)-304 (syntactic)-304 (forms.)-304 (A)]TJ +/N94 1 Tf +29.3975 0 TD +[(sync)15 (hr)45 (onized)-304 (method)]TJ +/N92 1 Tf +8.7692 0 TD +(of)Tj +-38.1667 -1.1667 TD +[(an)-251 (object)-251 (obtains)-250 (a)-251 (lock)-251 (on)-251 (the)-250 (object,)-251 (e)15 (x)15 (ecutes)-251 (the)-250 (method,)-251 (and)-251 (releases)-251 (the)-251 (lock.)-250 (A)]TJ +/N94 1 Tf +33.7825 0 TD +[(sync)15 (hr)45 (onized)]TJ +-33.7825 -1.1667 TD +(statement)Tj +/N92 1 Tf +3.8333 0 TD +(,)Tj +/N95 1 Tf +0.9967 0 TD +[(synchronize)-600 (\(exp\))-600 ({)-600 (...actions...)-600 (})]TJ +/N92 1 Tf +21 0 TD +[(,)-747 (e)25 (v)25 (aluates)-746 (the)-747 (e)15 (xpression)-746 (to)]TJ +-25.83 -1.1667 TD +[(obtain an object 
that is lock)11 (ed for the duration of the speci\336ed actions.)]TJ +0 -1.75 TD +[(The)-246 (synchronization)-246 (operations)-246 (in)-246 (the)-245 (Ja)20 (v)25 (a)-246 (language)-246 (are)-246 (reentrant)-246 (\(recursi)25 (v)15 (e\):)-246 (synchronized)-246 (state-)]TJ +0 -1.1667 TD +[(ments)-320 (on)-320 (the)-320 (same)-320 (object)-320 (can)-320 (nest,)-320 (synchronized)-319 (methods)-320 (can)-320 (in)41 (v)20 (o)0 (k)10 (e)-320 (other)-320 (synchronized)-319 (meth-)]TJ +T* +[(ods,)-260 (and)-259 (the)-260 (tw)10 (o)-260 (can)-259 (be)-260 (mix)15 (ed.)-260 (Consequently)65 (,)-260 (the)-260 (underlying)-260 (lock)-260 (and)-260 (unlock)-260 (operations)-260 (must)-260 (do)]TJ +T* +[(some)-298 (form)-298 (of)-299 (counting.)-297 (Both)-298 (synchronized)-298 (methods)-298 (and)-298 (statements)-298 (are)-298 (\322block)-298 (structured,)70 (\323)-298 (forc-)]TJ +T* +[(ing)-324 (perfect)-324 (nesting)-324 (of)-324 (locking)-324 (operations.)-324 (At)-323 (the)-325 (source)-323 (le)25 (v)15 (el,)-324 (there)-323 (is)-324 (no)-324 (w)10 (a)0 (y)-323 (t)0 (o)-324 (e)15 (xpress)-324 (unbal-)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 6 6 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(3)Tj +-19.25 56.4858 TD +[(anced)-344 (locking)-343 (operations.)-343 (In)-344 (particular)40 (,)-343 (e)15 (xception-thro)26 (wing)-343 (and)-344 (returning)-343 (out)-344 (of)-344 (lock)10 (ed)-343 (re)15 (gions)]TJ +0 -1.1667 TD +[(unlocks as necessary)66 (. 
Ho)25 (we)25 (v)15 (e)0 (r)40 (,)0 ( as we shall see belo)25 (w)65 (,)0 ( the story is dif)25 (ferent at the bytecode le)25 (v)15 (el.)]TJ +0 -1.75 TD +[(T)80 (o)-410 (f)10 (acilitate)-410 (communication)-410 (between)-410 (threads,)-409 (the)-410 (Ja)20 (v)25 (a)-410 (language)-410 (de\336nes)]TJ +/N95 1 Tf +30.3008 0 TD +(wait)Tj +/N92 1 Tf +2.4 0 TD +(,)Tj +/N95 1 Tf +0.66 0 TD +(notify)Tj +/N92 1 Tf +3.535 0 TD +[(,)-410 (and)]TJ +/N95 1 Tf +-36.8958 -1.1667 TD +(notifyAll)Tj +/N92 1 Tf +5.6608 0 TD +[(operations.)-260 (Lik)10 (e)-260 (locking,)-261 (these)-261 (operations)-260 (are)-261 (performed)-260 (relati)25 (v)15 (e)-261 (to)-261 (an)-260 (object.)-261 (Prior)]TJ +-5.6608 -1.1667 TD +[(to)-415 (e)15 (x)15 (ecuting)]TJ +/N95 1 Tf +5.4667 0 TD +(wait)Tj +/N92 1 Tf +2.8158 0 TD +(and)Tj +/N95 1 Tf +1.8592 0 TD +(notify)Tj +/N92 1 Tf +3.6 0 TD +([)Tj +/N95 1 Tf +0.3333 0 TD +(All)Tj +/N92 1 Tf +1.8 0 TD +[(],)-415 (a)-415 (thread)-416 (must)-415 (\336rst)-416 (lock)-415 (the)-415 (tar)18 (get)-415 (object.)-416 (Informally)66 (,)]TJ +/N95 1 Tf +-15.875 -1.1667 TD +(wait)Tj +/N92 1 Tf +2.6775 0 TD +[(fully)-277 (releases)-277 (the)-277 (lock)-277 (and)-278 (suspends)-277 (the)-277 (current)-277 (thread.)-277 (This)-278 (allo)25 (ws)-277 (other)-277 (threads)-278 (to)-277 (obtain)]TJ +-2.6775 -1.1667 TD +[(the)-448 (lock.)-448 (The)-448 (w)10 (aiting)-449 (thread)-448 (becomes)-448 (eligible)-447 (to)-449 (run)-448 (ag)6 (ain)-449 (when)-447 (another)-448 (thread)-448 (performs)-447 (a)]TJ +/N95 1 Tf +T* +(notify)Tj +/N92 1 Tf +3.9542 0 TD +[(operation)-354 (on)-354 (the)-355 (object,)-354 (a)-354 (speci\336ed)-354 (time)-354 (has)-354 (elapsed,)-354 (or)-354 (an)-353 (asynchronous)-355 (interrupt)-354 (is)]TJ +-3.9542 -1.1667 TD +[(deli)25 (v)15 (ered)-275 (to)-275 (the)-275 (thread.)-275 (The)]TJ +/N95 1 Tf +11.36 0 TD +(notify)Tj +/N92 1 Tf +3.875 0 TD +[(operation)-274 (w)9 (a)0 (k)10 (e)0 (s)-274 (one)-275 (w)10 (aiting)-275 (thread,)-275 (whereas)]TJ 
+/N95 1 Tf +18.365 0 TD +(notifyAll)Tj +/N92 1 Tf +-33.6 -1.1667 TD +[(w)10 (a)0 (k)10 (e)0 (s)-230 (all)-229 (w)9 (aiting)-229 (threads)-230 (\(when)-229 (no)-230 (threads)-230 (are)-230 (w)10 (aiting,)-230 (both)-229 (operations)-230 (are)-230 (no-ops\).)-230 (When)-229 (a)-230 (w)10 (ait-)]TJ +T* +[(ing)-252 (thread)-253 (w)10 (a)0 (k)10 (e)0 (s)-252 (up,)-252 (it)-252 (reacquires)-252 (the)-252 (lock)-252 (the)-252 (same)-252 (number)-251 (of)-253 (times)-252 (it)-252 (held)-252 (it)-252 (prior)-252 (to)]TJ +/N95 1 Tf +34.5425 0 TD +(wait)Tj +/N92 1 Tf +2.4 0 TD +[(.)-253 (The)]TJ +-36.9425 -1.1667 TD +[(lock)-342 (reacquisition)-342 (puts)-342 (the)-342 (thread)-342 (into)-342 (competition)-342 (with)-342 (other)-342 (threads)-342 (attempting)-342 (to)-342 (acquire)-343 (the)]TJ +T* +[(lock,)-365 (including)-366 (both)-365 (other)-366 (a)15 (w)10 (ak)10 (ened)-365 (w)9 (aiters)-366 (and)-365 (threads)-366 (attempting)-365 (to)-365 (e)15 (x)15 (ecute)-366 (a)-366 (synchronized)]TJ +T* +[(method)-346 (or)-345 (statement.)-346 (Once)-346 (a)-345 (w)10 (aiting)-345 (thread)-346 (has)-345 (reacquired)-346 (the)-345 (lock,)-346 (the)]TJ +/N95 1 Tf +30.0767 0 TD +(wait)Tj +/N92 1 Tf +2.7458 0 TD +[(operation)-345 (com-)]TJ +-32.8225 -1.1667 TD +[(pletes.)-239 (T)80 (o)-240 (simplify)-239 (matters,)-239 (in)-239 (this)-240 (paper)-239 (we)-240 (shall)-240 (not)-239 (discuss)-240 (interrupts)-239 (further)40 (,)-240 (e)16 (xcept)-240 (to)-239 (note)-240 (that)]TJ +T* +(in most of our implementation, an interrupt is handled similarly to a ti\ +me-out.)Tj +/N91 1 Tf +14 0 0 14 72 462.67 Tm +[(2.2 Bytecode-le)15 (v)10 (el synchr)18 (onization)]TJ +/N92 1 Tf +12 0 0 12 72 438 Tm +[(Ha)20 (ving)-269 (described)-268 (synchronization)-267 (at)-268 (the)-268 (source)-268 (le)25 (v)15 (el,)-268 (we)-268 (no)25 (w)-268 (turn)-268 (to)-268 (the)-268 (bytecode)-268 (le)25 (v)15 (el.)-269 (In)-268 (byte-)]TJ +T* +[(code,)-693 (method)-692 (synchronization)-692 (is)-693 
(performed)-692 (as)-693 (part)-692 (of)-693 (the)-693 (call/return)-692 (sequence:)-693 (if)-693 (the)]TJ +/N95 1 Tf +T* +(acc_synchronized)Tj +/N92 1 Tf +9.8625 0 TD +[(attrib)20 (ute)-263 (is)-262 (set)-262 (on)-262 (a)-263 (method,)-262 (the)-262 (call)-263 (sequence)-262 (must)-262 (acquire)-263 (the)-262 (lock)-263 (\(one)]TJ +-9.8625 -1.1667 TD +[(more)-286 (time\))-286 (and)-286 (the)-286 (return)-285 (sequence)-286 (must)-286 (release)-286 (it)-286 (once.)-286 (Statement)-286 (synchronization)-286 (is)-285 (e)15 (xpressed)]TJ +T* +[(using)-368 (a)-368 (bytecode)-368 (pair)40 (,)]TJ +/N95 1 Tf +9.4592 0 TD +(monitorenter)Tj +/N92 1 Tf +7.5683 0 TD +(and)Tj +/N95 1 Tf +1.8125 0 TD +(monitorexit)Tj +/N92 1 Tf +6.6 0 TD +[(,)-368 (which)-368 (lock)-368 (and)-368 (unlock,)-369 (respec-)]TJ +-25.44 -1.1667 TD +[(ti)25 (v)15 (ely)65 (,)-322 (the)-323 (object)-322 (referenced)-322 (by)-322 (a)-323 (v)25 (alue)-322 (popped)-322 (from)-323 (the)-322 (JVM\325)55 (s)-322 (\322operand)-322 (stack.)70 (\323)-323 (Unfortunately)65 (,)]TJ +T* +[(while)-290 (the)-290 (bytecode)-290 (representation)-289 (of)-290 (synchronized)-289 (methods)-290 (is)-290 (inherently)-290 (well-nested,)-290 (there)-290 (is)-290 (no)]TJ +T* +[(such)-266 (guarantee)-266 (for)]TJ +/N95 1 Tf +7.685 0 TD +(monitorenter)Tj +/N92 1 Tf +7.4658 0 TD +(and)Tj +/N95 1 Tf +1.7108 0 TD +(monitorexit)Tj +/N92 1 Tf +6.6 0 TD +[(.)-266 (Nothing)-267 (pre)25 (v)15 (ents)-266 (bytecode)-266 (from)-267 (con-)]TJ +-23.4617 -1.1667 TD +[(taining)-290 (instruction)-290 (sequences)-291 (lik)10 (e)-290 (\322lock)]TJ +/N94 1 Tf +16.2183 0 TD +(A)Tj +/N92 1 Tf +0.6108 0 TD +[(,)-291 (lock)]TJ +/N94 1 Tf +2.5533 0 TD +(B)Tj +/N92 1 Tf +0.6108 0 TD +[(,)-290 (unlock)]TJ +/N94 1 Tf +3.5525 0 TD +(A)Tj +/N92 1 Tf +0.6117 0 TD +[(,)-290 (unlock)]TJ +/N94 1 Tf +3.5525 0 TD +(B)Tj +/N92 1 Tf +0.6108 0 TD +[(,)70 (\323)-291 (which)-290 (has)-290 (no)-291 (equi)25 (v)25 (alent)]TJ +-28.3208 -1.1667 TD +[(Ja)20 (v)25 (a)-264 
(source)-264 (code.)-264 (The)-264 (loss)-264 (of)-264 (block)-264 (structure)-263 (at)-265 (the)-264 (bytecode)-264 (le)25 (v)15 (e)0 (l)-264 (mak)10 (es)-264 (it)-264 (dif)25 (\336cult)-264 (to)-264 (stack)-263 (allo-)]TJ +T* +[(cate)-366 (locking-related)-365 (data)-365 (structures.)-365 (Consequently)64 (,)-365 (t)0 (o)-365 (handle)-366 (non-LIFO)-365 (locking)-366 (and)-365 (unlocking,)]TJ +T* +(our implementation uses a free-list allocator \(see Section 6.1\).)Tj +0 -1.75 TD +[(Concei)25 (v)25 (ably)65 (,)-319 (a)-319 (static)-320 (analysis)-319 (of)-319 (bytecode)-319 (could)-320 (conserv)25 (ati)25 (v)15 (ely)-319 (\322pair)-320 (up\323)]TJ +/N95 1 Tf +30.0367 0 TD +(monitorenter)Tj +/N92 1 Tf +7.5192 0 TD +(and)Tj +/N95 1 Tf +-37.5558 -1.1667 TD +(monitorexit)Tj +/N92 1 Tf +7.0375 0 TD +[(instructions)-437 (in)-438 (most)-437 (cases,)-437 (allo)25 (wing)-437 (subsequent)-437 (e)15 (x)15 (ecution)-437 (to)-438 (assume)-438 (perfect)]TJ +-7.0375 -1.1667 TD +[(nesting.)-335 (W)80 (e)-335 (w)10 (ould)-335 (e)16 (xpect)-336 (this)-335 (analysis)-334 (to)-335 (succeed)-335 (most)-335 (of)-335 (the)-335 (time)-335 (on)-335 (bytecode)-335 (resulting)-335 (from)]TJ +T* +[(translation)-274 (of)-274 (Ja)20 (v)25 (a)-274 (source)-274 (code.)-274 (In)-274 (general,)-274 (ho)26 (we)24 (v)15 (e)0 (r)40 (,)-274 (the)-275 (problem)-274 (is)-275 (undecidable,)-274 (and)-274 (it)-274 (w)10 (ould)-274 (be)]TJ +T* +[(incorrect)-257 (to)-257 (reject)-258 (bytecode)-257 (for)-257 (which)-258 (the)-258 (analysis)-257 (f)10 (ails,)-257 (because)-258 (such)-257 (bytecode)-257 (is)-258 (still)-257 (le)15 (gal)-258 (i)0 (n)-258 (the)]TJ +T* +[(sense)-371 (that)-372 (it)-371 (passes)-371 (the)-371 (bytecode)-371 (v)15 (eri\336er)-371 ([18].)-371 (T)80 (o)-372 (further)-371 (complicate)-371 (the)-370 (picture,)-371 (JNI,)-371 (the)-371 (Ja)21 (v)25 (a)]TJ +T* +[(Nati)25 (v)15 (e)-390 (Interf)10 (ace,)-390 (grants)-390 (nati)25 (v)15 (e)-389 (code)-390 (access)-390 (to)-389 
(synchronization)-390 (in)-390 (the)-390 (form)-390 (of)-389 (lock)-390 (and)-390 (unlock)]TJ +T* +[(operations,)-314 (so)-313 (e)26 (v)15 (en)-314 (if)-313 (all)-313 (bytecode)-314 (can)-314 (be)-313 (sho)25 (wn)-313 (to)-314 (be)-313 (structured,)-313 (a)-313 (f)10 (all-back)-314 (mechanism)-313 (w)10 (ould)]TJ +T* +[(still be needed for synchronization by nati)26 (v)15 (e)0 ( code.)]TJ +0 -1.75 TD +[(Finally)65 (,)-361 (the)]TJ +/N95 1 Tf +4.9633 0 TD +(wait)Tj +/N92 1 Tf +2.7608 0 TD +(and)Tj +/N95 1 Tf +1.8058 0 TD +(notify)Tj +/N92 1 Tf +3.6 0 TD +([)Tj +/N95 1 Tf +0.3325 0 TD +(All)Tj +/N92 1 Tf +1.8 0 TD +[(])-361 (operations)-361 (ha)20 (v)15 (e)-361 (no)-361 (direct)-361 (representation)-361 (at)-361 (the)-361 (bytecode)]TJ +-15.2625 -1.1667 TD +[(le)25 (v)15 (el,)-1103 (b)20 (u)0 (t)-1103 (instead)-1103 (are)-1103 (implemented)-1102 (as)-1103 (nati)24 (v)16 (e)-1103 (methods)-1102 (of)-1103 (the)-1103 (top-most)-1102 (class)]TJ +T* +(\()Tj +/N95 1 Tf +0.3333 0 TD +(java.lang.Object)Tj +/N92 1 Tf +9.6 0 TD +(\).)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 7 7 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(4)Tj +/N91 1 Tf +16 0 0 16 72 709.33 Tm +[(3 Related w)10 (ork)]TJ +/N92 1 Tf +12 0 0 12 72 686 Tm +[(There)-309 (is)-308 (a)-309 (lar)18 (ge)-308 (general)-309 (literature)-308 (on)-309 (synchronization)-307 (primiti)25 (v)15 (e)0 (s)-309 (and)-308 (their)-309 (implementation.)-308 (Dijk-)]TJ +0 -1.1667 TD +[(stra)-278 ([9])-277 (and)-278 (Lamport)-277 ([16])-278 (present)-279 (subtle)-278 (algorithms)-278 (that)-278 (achie)25 (v)15 (e)-278 (mutual)-278 (e)15 (xclusion)-277 (assuming)-278 (only)]TJ +T* +[(that)-289 (indi)25 (vidual)-290 (reads)-289 (and)-288 (writes)-289 (are)-290 
(atomic.)-289 (F)15 (ortunately)65 (,)-290 (modern)-289 (architectures)-289 (pro)16 (vide)-290 (composite)]TJ +T* +[(instructions)-287 (such)-287 (as)-286 (compare-and-sw)10 (ap)-287 (that)-287 (read)-286 (and)-287 (write)-287 (a)-287 (memory)-287 (location)-286 (in)-287 (a)-288 (single)-287 (atomic)]TJ +T* +[(step,)-241 (greatly)-241 (simplifying)-241 (the)-241 (mutual)-241 (e)15 (xclusion)-242 (problem)-241 (and)-241 (eliminating)-241 (the)-241 (need)-241 (for)-241 (such)-242 (subtlety)65 (.)]TJ +T* +[(W)80 (e)-276 (shall)-277 (refer)-277 (to)-276 (the)-276 (composite)-277 (atomic)-276 (instructions)-276 (simply)-277 (as)-276 (\322atomic)-276 (instructions.)70 (\323)-277 (W)81 (eakly)-277 (con-)]TJ +T* +[(sistent)-245 (memory)-243 (models,)-244 (which)-245 (allo)25 (w)-244 (dif)24 (ferent)-244 (processors)-245 (to)-244 (observ)15 (e)-244 (memory)-244 (operations)-244 (as)-244 (occur-)]TJ +T* +[(ring)-501 (in)-500 (dif)25 (ferent)-500 (orders,)-500 (may)-500 (require)-500 (the)-500 (use)-500 (of)-500 (memory)-500 (barrier)-500 (instructions)-500 (to)-500 (reintroduce)]TJ +T* +[(consistenc)15 (y)65 (,)0 ( whether using indi)25 (vidual reads and writes or atomic instructions.)]TJ +0 -1.75 TD +[(In)-247 (its)-247 (broad)-246 (structure,)-247 (our)-247 (meta-lock)-246 (resembles)-247 (the)-247 (MCS-lock)-246 (of)-247 (Mellor)20 (-Crumme)15 (y)-247 (and)-247 (Scott)-246 ([20].)]TJ +0 -1.1667 TD +[(The)-261 (MCS-lock)-261 (uses)-261 (an)-261 (atomic)-261 (sw)10 (ap)-261 (for)-261 (lock)-261 (acquisition)-261 (and)-261 (an)-261 (atomic)-261 (compare-and-sw)10 (ap)-261 (\()]TJ +/N95 1 Tf +36.8667 0 TD +(CAS)Tj +/N92 1 Tf +1.8 0 TD +(\))Tj +-38.6667 -1.1667 TD +[(for)-251 (lock)-250 (release)-251 (in)-250 (much)-250 (the)-250 (same)-250 (w)10 (a)0 (y)-250 (a)0 (s)-250 (does)-250 (our)-249 (meta-lock)-251 (algorithm.)-250 (The)-250 (MCS-lock)-250 (has)-250 (man)15 (y)]TJ +T* +[(of)-377 (the)-377 (same)-377 (properties)-377 (as)-377 (our)-377 (meta-lock,)-377 (including)-377 
(starv)25 (ation)-377 (freedom)-377 (and)-378 (FIFO)-377 (access)-377 (to)-378 (the)]TJ +T* +[(lock.)-245 (Ho)25 (we)25 (v)15 (e)0 (r)40 (,)-244 (the)-245 (details)-245 (are)-244 (quite)-245 (dif)25 (ferent)-245 (and,)-244 (in)-244 (particular)40 (,)-244 (contention)-244 (results)-244 (in)-245 (b)20 (usy-w)11 (aiting.)]TJ +T* +[(Also,)-285 (space-ef)25 (\336cienc)15 (y)-285 (i)0 (s)-285 (not)-284 (as)-285 (e)15 (xtreme)-285 (a)-285 (concern)-285 (in)-284 (the)-285 (conte)15 (xt)-285 (of)-284 (their)-285 (w)10 (ork.)-285 (Magnusson)]TJ +/N94 1 Tf +36.965 0 TD +[(et)-285 (al.)]TJ +/N92 1 Tf +-36.965 -1.1667 TD +[([19])-324 (surv)15 (e)15 (y)-324 (locking)-323 (schemes)-323 (related)-324 (to)-324 (MCS-locks,)-324 (and)-324 (describe)-324 (tw)10 (o)-323 (n)0 (e)25 (w)-324 (locking)-324 (schemes)-323 (that)]TJ +T* +[(of)25 (fer dif)25 (ferent performance trade-of)26 (fs.)]TJ +0 -1.75 TD +[(Brinch)-249 (Hansen)-248 ([11])-248 (and)-248 (Hoare)-249 ([12])-249 (coined)-248 (the)-249 (term)]TJ +/N94 1 Tf +20.8983 0 TD +(monitor)Tj +/N92 1 Tf +3.1267 0 TD +[(,)-249 (and)-248 (pro)16 (vided)-249 (the)-248 (nomenclature)-248 (used)]TJ +-24.025 -1.1667 TD +[(by)-226 (the)-226 (Ja)21 (v)25 (a)-226 (language.)-226 (There)-225 (are)-226 (se)26 (v)15 (eral)-226 (w)9 (ays,)-225 (ho)25 (we)24 (v)15 (e)0 (r)40 (,)-226 (in)-226 (which)-226 (the)-226 (monitors)-225 (in)-226 (the)-225 (Ja)20 (v)25 (a)-226 (language)]TJ +T* +[(dif)25 (fer)-355 (from)-354 (the)-354 (original:)-354 (an)15 (y)-355 (object)-354 (may)-354 (be)-355 (used)-354 (as)-354 (a)-355 (monitor)41 (,)-354 (monitors)-355 (may)-354 (be)-355 (entered)-354 (recur-)]TJ +T* +[(si)25 (v)15 (ely)65 (,)-382 (and)-383 (monitors)-383 (pro)15 (vide)-382 (a)-384 (single)-383 (implicit,)-383 (rather)-382 (than)-383 (possibly)-382 (se)25 (v)15 (eral)-383 (e)15 (xplicit,)-383 (\322condition)]TJ +T* +[(v)25 (ariables.)70 (\323)-379 (Birrell)-380 (gi)26 (v)15 (e)0 (s)-379 (a)0 (n)-380 (e)16 (xcellent)-380 (tutorial)-379 (on)-380 
(programming)-379 (with)-379 (\322standard\323)-379 (synchronization)]TJ +T* +[(primiti)25 (v)15 (e)0 (s)-235 ([5].)-236 (As)-236 (we)-236 (ha)20 (v)15 (e)-236 (discussed,)-235 (the)-236 (synchronization)-235 (primiti)25 (v)15 (e)0 (s)-236 (o)0 (f)-235 (the)-236 (Ja)21 (v)25 (a)-236 (platform)-236 (are)-236 (dif)25 (\336-)]TJ +T* +[(cult)-262 (to)-262 (implement)-262 (in)-262 (a)-262 (manner)-262 (that)-262 (is)-261 (both)-262 (time-)-262 (and)-262 (space-ef)25 (\336cient.)-262 (Scalability)-262 (to)-262 (multiprocessor)]TJ +T* +[(systems)-444 (is)-444 (another)-444 (important)-443 (concern.)-444 (W)80 (e)-444 (shall)-444 (no)25 (w)-444 (discuss)-443 (some)-444 (pre)25 (vious)-444 (implementations)]TJ +T* +[(focusing on the approach the)15 (y tak)10 (e to trading of)25 (f these concerns.)]TJ +0 -1.75 TD +[(The)-342 (original)-343 (JDK\252)-342 (implementation)-342 (of)-343 (the)-343 (Ja)21 (v)25 (a)-343 (virtual)-343 (machine)-342 ([18])-343 (pro)16 (vides)-343 (a)-343 (space-ef)25 (\336cient)]TJ +0 -1.1667 TD +[(monitor)-374 (implementation,)-373 (b)20 (u)0 (t)-374 (one)-374 (that)-374 (is)-375 (not)-374 (particularly)-374 (time-ef)25 (\336cient)-374 (or)-374 (scalable.)-374 (This)-374 (design)]TJ +T* +[(requires)-260 (that)-260 (each)-260 (object)-260 (has)-260 (a)-260 (unique)-260 (identi\336er)-260 (v)25 (alid)-260 (o)14 (v)16 (er)-260 (its)-260 (lifetime.)-260 (The)-260 (actual)-260 (implementation)]TJ +T* +[(uses)-336 (an)]TJ +/N94 1 Tf +3.3392 0 TD +[(object)-337 (table)]TJ +/N92 1 Tf +5.1175 0 TD +[(whose)-337 (entries)-336 (are)-336 (called)]TJ +/N94 1 Tf +10.1758 0 TD +(handles)Tj +/N92 1 Tf +3.1108 0 TD +[(,)-337 (pro)15 (viding)-336 (an)-337 (e)15 (xtra)-337 (le)25 (v)15 (e)0 (l)-336 (o)0 (f)-337 (indirection)-336 (to)]TJ +-21.7433 -1.1667 TD +[(f)10 (acilitate)-251 (object)-252 (compaction.)-251 (The)-251 (handle)-251 (address)-251 (of)-251 (an)-252 (object)-251 (remains)-251 (constant)-251 (during)-251 (the)-251 (object\325)55 (s)]TJ +T* 
+[(lifetime,)-335 (and)-336 (can)-336 (therefore)-336 (serv)15 (e)-336 (a)0 (s)-335 (a)-336 (unique)-335 (identi\336er)55 (.)-336 (A)-335 (global)-336 (table)-336 (called)-336 (the)]TJ +/N94 1 Tf +33.18 0 TD +[(monitor)-336 (cac)16 (he)]TJ +/N92 1 Tf +-33.18 -1.1667 TD +[(maps)-341 (object)-341 (handle)-341 (addresses)-341 (to)-341 (monitor)-341 (structures)-341 (that)-342 (can)-341 (be)-341 (used)-341 (to)-341 (perform)-341 (the)-341 (actual)-341 (syn-)]TJ +T* +[(chronization)-242 (operations.)-242 (When)-243 (a)-242 (thread)-243 (synchronizes)-242 (on)-242 (an)-243 (object,)-243 (it)-242 (\336rst)-242 (ensures)-243 (that)-242 (the)-243 (monitor)]TJ +T* +[(cache)-309 (maps)-309 (the)-309 (object)-309 (to)-309 (a)-309 (monitor)41 (,)-309 (creating)-309 (and)-309 (installing)-308 (the)-309 (monitor)-309 (in)-309 (the)-309 (table)-309 (if)-309 (necessary)66 (.)]TJ +T* +[(This)-275 (approach)-275 (has)-274 (no)-275 (\336x)15 (ed)-275 (per)20 (-object)-275 (space)-275 (cost,)-275 (using)-275 (only)-274 (space)-276 (proportional)-274 (to)-275 (the)-275 (number)-275 (of)]TJ +T* +[(entries)-399 (in)-399 (the)-399 (monitor)-399 (cache.)-399 (Ho)25 (we)25 (v)15 (e)0 (r)40 (,)-398 (it)-399 (is)-399 (f)10 (airly)-399 (time-inef)25 (\336cient,)-399 (since)-398 (each)-399 (synchronization)]TJ +T* +[(operation)-311 (must)-311 (\336rst)-311 (do)-311 (\(at)-312 (least\))-311 (a)-311 (table)-311 (lookup)-311 (to)-311 (locate)-310 (the)-311 (monitor)-311 (associated)-311 (with)-311 (the)-310 (object.)]TJ +T* +[(Further)40 (,)-287 (i)0 (t)-288 (i)0 (s)-288 (not)-288 (v)15 (ery)-288 (scalable.)-288 (The)-288 (monitor)-288 (cache)-288 (is)-288 (a)-288 (global)-288 (data)-287 (structure)-288 (that)-287 (is)-288 (accessed)-288 (con-)]TJ +T* +[(currently)65 (.)-260 (T)79 (o)-260 (mak)10 (e)-260 (this)-261 (concurrent)-260 (access)-260 (safe,)-261 (the)-260 (monitor)-260 (cache)-260 (is)-260 (protected)-260 (by)-260 (a)-260 (lock.)-260 (Thus,)-260 (all)]TJ +T* +[(synchronization)-367 
(operations)-367 (obtain)-367 (a)-368 (single)-367 (lock,)-368 (an)-367 (ob)15 (vious)-367 (source)-367 (of)-368 (contention.)-367 (The)-367 (monitor)]TJ +T* +(cache locking also adds some cost in the uncontended case.)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 8 8 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(5)Tj +-19.25 56.4858 TD +(Bacon)Tj +/N94 1 Tf +2.8117 0 TD +[(et)-255 (al.)]TJ +/N92 1 Tf +2.2625 0 TD +[([3])-256 (propose)-256 (a)-256 (cle)24 (v)15 (e)0 (r)-256 (scheme)-256 (moti)25 (v)25 (ated)-256 (by)-257 (man)15 (y)-257 (o)0 (f)-256 (the)-256 (same)-256 (concerns)-257 (as)-256 (ours.)-256 (In)-256 (this)]TJ +-5.0742 -1.1667 TD +[(design,)-274 (24)-274 (bits)-274 (of)-274 (each)-274 (object)-274 (header)-274 (are)-274 (de)25 (v)20 (oted)-274 (to)-274 (locking.)-273 (One)-275 (bit)-274 (indicates)-274 (whether)-274 (the)-274 (object)]TJ +T* +[(has)-295 (a)]TJ +/N94 1 Tf +2.3683 0 TD +(thin)Tj +/N92 1 Tf +1.8517 0 TD +(or)Tj +/N94 1 Tf +1.1283 0 TD +(fat)Tj +/N92 1 Tf +1.3517 0 TD +[(lock.)-295 (If)-295 (it)-296 (has)-295 (a)-296 (thin)-296 (lock,)-295 (then)-295 (all)-295 (necessary)-295 (locking)-296 (information)-295 (is)-295 (contained)-295 (in)]TJ +-6.7 -1.1667 TD +[(the)-276 (remaining)-276 (23)-277 (bits.)-277 (If)-276 (it)-276 (has)-277 (a)-277 (f)10 (at)-277 (lock,)-276 (then)-276 (the)-277 (remaining)-276 (23)-277 (bits)-277 (are)-276 (an)-276 (inde)15 (x)-277 (into)-276 (an)-277 (array)-276 (of)]TJ +T* +[(pointers)-257 (to)-257 (f)10 (a)0 (t)-257 (lock)-257 (structures)-258 (that,)-257 (much)-257 (lik)9 (e)-257 (monitors,)-257 (hold)-257 (the)-258 (necessary)-257 (data)-257 (for)-256 (the)-258 (synchroni-)]TJ +T* +[(zation)-305 (operations.)-305 (Thin)-305 (locks)-305 (are)-305 
(used)-304 (as)-305 (long)-304 (as)-305 (the)-305 (lock)-305 (is)-305 (uncontended,)-305 (and)-305 (is)-305 (not)-304 (recursi)24 (v)15 (ely)]TJ +T* +[(lock)10 (ed)-294 (more)-294 (times)-294 (than)-295 (can)-294 (be)-294 (represented)-293 (in)-294 (a)-293 (count)-295 (\336eld)-294 (of)-294 (the)-294 (thin)-294 (lock;)-294 (if)-294 (either)-294 (condition)-294 (is)]TJ +T* +[(violated,)-340 (the)-340 (lock)-340 (representation)-340 (is)-340 (\322in\337ated.)70 (\323)-340 (This)-340 (design)-340 (does)-340 (well)-340 (at)-340 (optimizing)-341 (the)-340 (e)15 (xpected)]TJ +T* +[(common)-233 (case)-233 (of)-234 (uncontended)-233 (locking)-233 (and)-233 (unlocking.)-233 (Locking)-233 (requires)-233 (a)-234 (small)-233 (number)-233 (of)-234 (instruc-)]TJ +T* +[(tions,)-264 (with)-264 (only)-264 (one)-263 (atomic)-264 (instruction)-263 (in)-264 (the)-265 (f)10 (ast)-264 (path,)-264 (and)-263 (unlocking)-264 (requires)-264 (no)-263 (atomic)-264 (instruc-)]TJ +T* +[(tion. Ho)25 (we)25 (v)15 (e)0 (r)40 (,)0 ( thin locks lea)20 (v)15 (e)0 ( some issues to be addressed:)]TJ +/N95 1 Tf +10 0 0 10 72 552 Tm +( \245)Tj +/N94 1 Tf +12 0 0 12 85.75 552 Tm +(Contention.)Tj +/N92 1 Tf +4.9383 0 TD +[(When)-243 (there)-243 (is)-243 (contention)-244 (on)-243 (a)-243 (lock)-244 (for)-243 (the)-243 (\336rst)-244 (time,)-243 (all)-243 (threads)-244 (that)-243 (do)-243 (not)-243 (acquire)]TJ +-4.9383 -1.1667 TD +[(the)-240 (lock)-240 (must)]TJ +/N94 1 Tf +5.6075 0 TD +[(b)20 (usy-wait)]TJ +/N92 1 Tf +4.1083 0 TD +[(\(a.k.a.)-240 (spin\))-239 (until)-239 (the)-240 (lock)-239 (is)-240 (released.)-240 (Unbounded)-239 (b)20 (usy-w)10 (aiting)-239 (is)-240 (gen-)]TJ +-9.7158 -1.1667 TD +[(erally)-303 (undesirable)-303 (b)20 (ut,)-304 (as)-303 (the)-304 (authors)-303 (point)-303 (out,)-304 (b)20 (usy-w)10 (aiting)-303 (is)-304 (done)-303 (at)-304 (most)-302 (once)-304 (in)-303 (the)-304 (life-)]TJ +T* +[(time)-309 (of)-308 (a)-308 (g)0 (i)25 (v)15 (en)-308 (object.)-308 (Still,)-309 (it)-308 
(w)10 (ould)-309 (not)-308 (be)-308 (hard)-308 (to)-309 (construct)-308 (a)-308 (program)-308 (that)-308 (does)-309 (contended)]TJ +T* +[(locking on man)15 (y short-li)26 (v)15 (ed objects, causing a great deal of b)21 (usy-w)10 (aiting.)]TJ +/N95 1 Tf +10 0 0 10 72 476 Tm +( \245)Tj +/N94 1 Tf +12 0 0 12 85.75 476 Tm +[(Lac)21 (k)-395 (o)0 (f)-395 (de\337ation.)]TJ +/N92 1 Tf +7.6367 0 TD +(Lock)Tj +/N94 1 Tf +2.45 0 TD +(de\337ation)Tj +/N92 1 Tf +3.8958 0 TD +[(is)-395 (not)-395 (discussed;)-395 (once)-395 (a)-395 (lock)-395 (becomes)-395 (f)10 (at,)-395 (it)-396 (remains)-395 (f)10 (at.)]TJ +-13.9825 -1.1667 TD +[(While)-245 (this)-246 (is)-245 (not)-245 (an)-245 (issue)-245 (for)-246 (most)-245 (programs,)-245 (one)-245 (could)-245 (imagine)-245 (a)-245 (long-li)25 (v)15 (e)0 (d)-245 (program)-245 (in)-245 (which)]TJ +T* +[(man)15 (y)-368 (long-li)25 (v)14 (e)0 (d)-368 (objects)-368 (are)-368 (lock)10 (ed)-368 (with)-368 (contention)-368 (at)-368 (some)-368 (point)-368 (in)-368 (their)-368 (lifetimes.)-367 (Absent)]TJ +T* +[(some)-381 (form)-380 (of)-380 (de\337ation,)-380 (such)-380 (a)-380 (program)-380 (w)10 (ould)-380 (consume)-381 (a)-380 (lar)18 (ge)-380 (amount)-381 (of)-380 (memory)-379 (for)-381 (f)10 (a)0 (t)]TJ +T* +(locks.)Tj +/N95 1 Tf +10 0 0 10 72 400 Tm +( \245)Tj +/N94 1 Tf +12 0 0 12 85.75 400 Tm +[(Space)-359 (consumption.)]TJ +/N92 1 Tf +8.4683 0 TD +[(Thin)-359 (locks)-360 (use)-359 (24)-360 (bits.)-359 (This)-359 (is)-360 (a)-359 (signi\336cant)-359 (o)15 (v)15 (erhead)-359 (since)-360 (the)-360 (a)20 (v)15 (erage)]TJ +-8.4683 -1.1667 TD +(object size for most programs is quite small [8].)Tj +-1.1458 -1.75 TD +[(In)-452 (other)-451 (related)-451 (w)9 (ork,)-451 (monitor)-452 (implementations)-451 (ha)21 (v)15 (e)-452 (been)-452 (proposed)-451 (that)-451 (e)15 (xploit)-452 (cooperati)25 (v)15 (e)]TJ +0 -1.1667 TD +[(thread)-247 (scheduling)-247 ([14])-247 (or)-246 (mak)10 (e)-247 (special)-247 (pro)16 (vision)-247 (for)-246 (f)10 
(aster)-247 (e)16 (x)15 (ecution)-247 (of)-247 (single-threaded)-247 (programs)]TJ +T* +[([21,)-252 (24].)-252 (As)-251 (we)-252 (ha)20 (v)15 (e)-252 (already)-252 (noted,)-251 (we)-252 (assume)-252 (preempti)25 (v)15 (e)-252 (scheduling,)-252 (and)-252 (desire)-252 (algorithms)-252 (that)]TJ +T* +[(w)10 (ork)-309 (well)-309 (for)-309 (both)-309 (single-)-308 (and)-309 (multi-threaded)-308 (programs,)-308 (so)-309 (we)-308 (shall)-308 (not)-309 (discuss)-308 (these)-309 (restricted)]TJ +T* +[(schemes further)55 (.)]TJ +0 -1.75 TD +[(In)-242 ([4],)-241 (Bak)-242 (describes)-242 (ho)25 (w)-241 (a)0 (n)-242 (early)-242 (v)15 (ersion)-243 (of)-241 (Sun\325)55 (s)-242 (HotSpot\252)-242 (JVM)-242 (locks)-241 (objects)-242 (by)-242 (replacing)-241 (an)]TJ +0 -1.1667 TD +[(object)-252 (header)-252 (w)10 (ord)-253 (with)-252 (a)-252 (pointer)-252 (to)-252 (an)-253 (e)16 (xternal)-252 (lock)-252 (structure,)-252 (displacing)-251 (the)-252 (original)-252 (contents)-252 (of)]TJ +T* +[(the)-270 (header)-269 (w)10 (ord)-269 (into)-270 (the)-269 (lock)-270 (structure.)-269 (T)80 (w)10 (o)-270 (l)0 (o)26 (w-order)-270 (bits)-269 (in)-269 (the)-269 (header)-270 (w)10 (ord)-269 (encode)-269 (its)-270 (format.)]TJ +T* +[(HotSpot)-361 (stack)-361 (allocates)-361 (lock)-360 (structures)-360 (for)-361 (ef)25 (\336cienc)15 (y)65 (.)-361 (Furthermore,)-361 (this)-361 (stack)-361 (allocation)-361 (allo)25 (ws)]TJ +T* +[(f)10 (ast)-388 (recursi)25 (v)15 (e)-389 (locking)-389 (by)-388 (enabling)-388 (ef)25 (\336cient)-389 (v)15 (eri\336cation)-388 (that)-388 (an)-388 (object)-388 (is)-389 (lock)10 (ed)-388 (by)-388 (the)-389 (current)]TJ +T* +[(thread:)-305 (if)-306 (the)-305 (lock)-305 (structure)-306 (address)-306 (is)-306 (suf)25 (\336ciently)-306 (close)-306 (to)-305 (the)-306 (current)-305 (stack)-306 (pointer)-306 (to)-306 (guarantee)]TJ +T* +[(membership)-276 (in)-276 (the)-276 (same)-277 (thread)-276 (stack,)-277 (the)-276 (current)-276 (thread)-276 (must)-276 (be)-277 (the)-276 
(lock)-276 (o)25 (wner)56 (.)-277 (I)0 (n)-276 (non-preemp-)]TJ +T* +[(ti)25 (v)15 (e)-293 (thread)-293 (systems,)-292 (locking)-292 (and)-292 (unlocking)-292 (can)-292 (be)-293 (implemented)-292 (straightforw)10 (ardly)65 (.)-292 (I)0 (n)-292 (preempti)25 (v)15 (e)]TJ +T* +[(systems,)-344 (a)-343 (more)-343 (complicated)-343 (test-and-set)-344 (protocol)-343 (on)-343 (one)-343 (of)-344 (the)-344 (bits)-344 (in)-344 (the)-343 (header)-344 (w)10 (ord)-344 (grants)]TJ +T* +[(e)15 (xclusi)25 (v)15 (e)0 ( access to at most one thread; other threads b)21 (usy-w)10 (ait for their turn.)]TJ +0 -1.75 TD +[(W)80 (e)-246 (borro)25 (w)-246 (HotSpot\325)55 (s)-246 (idea)-246 (of)-246 (displacing)-246 (a)-246 (header)-247 (w)10 (ord)-246 (and)-246 (using)-246 (some)-246 (bits)-247 (of)-246 (the)-246 (w)10 (ord)-246 (to)-247 (encode)]TJ +0 -1.1667 TD +[(its)-237 (format)-236 (and)-236 (the)-236 (lock)-236 (state.)-236 (W)80 (e)-236 (use)-236 (a)-236 (dif)24 (ferent)-235 (allocation)-237 (scheme)-236 (to)-236 (support)-237 (non-block-structured)]TJ +T* +[(synchronization and emplo)11 (y a b)20 (usy-w)10 (ait-free protocol to protect the header w)11 (ord.)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 9 9 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(6)Tj +/N91 1 Tf +16 0 0 16 72 709.33 Tm +(4 Our algorithm)Tj +/N92 1 Tf +12 0 0 12 72 686 Tm +[(In)-250 (an)15 (y)-251 (concurrent)-250 (en)40 (vironment,)-250 (there)-250 (must)-251 (be)-251 (some)-251 (protocol)-250 (observ)15 (ed)-251 (by)-251 (threads)-250 (as)-250 (the)15 (y)-250 (manipu-)]TJ +0 -1.1667 TD +[(late)-317 (the)]TJ +/N94 1 Tf +3.3 0 TD +[(sync)15 (hr)45 (onization)-317 (data)]TJ +/N92 1 Tf +8.7408 0 TD +[(of)-317 (objects,)-317 (that)-317 (is,)-317 
(the)-316 (data)-317 (structures)-317 (that)-317 (manage)-317 (synchronization)]TJ +-12.0408 -1.1667 TD +[(operations.)-302 (T)80 (ypically)65 (,)-302 (the)-302 (protocol)-303 (speci\336es)-302 (when)-302 (a)-303 (thread)-303 (may)-302 (access)-302 (or)-303 (manipulate)-302 (an)-303 (object\325)55 (s)]TJ +T* +[(synchronization)-272 (data.)-273 (F)15 (o)0 (r)-273 (e)15 (xample,)-272 (the)-273 (thin-locks)-272 (approach)-272 (relies)-273 (on)-272 (an)-273 (in)41 (v)25 (ariant)-273 (whereby)-273 (only)-273 (a)]TJ +T* +[(thread)-492 (o)25 (wning)-492 (the)-491 (lock)-491 (on)-492 (an)-492 (object)-492 (may)-491 (modify)-492 (that)-492 (object\325)55 (s)-492 (synchronization)-491 (data.)-492 (Other)]TJ +T* +[(approaches,)-308 (lik)9 (e)-308 (ours,)-309 (allo)25 (w)-309 (a)0 (n)15 (y)-309 (thread)-308 (to)-309 (update)-309 (this)-308 (information.)-309 (The)-309 (k)10 (e)15 (y)-309 (t)0 (o)-309 (our)-309 (approach)-309 (is)-309 (a)]TJ +T* +[(time-)-296 (and)-297 (space-ef)25 (\336cient)-297 (meta-lock)-297 (associated)-296 (with)-297 (the)-296 (synchronization)-297 (data)-297 (of)-296 (each)-297 (object.)-297 (The)]TJ +T* +(typical pattern for synchronization operations in our system is:)Tj +1.1458 -1.6667 TD +[(1.)-228 (Obtain)-229 (the)-229 (object\325)55 (s)-229 (meta-lock)-228 (to)-229 (ensure)-228 (e)15 (xclusi)25 (v)15 (e)-229 (access)-228 (to)-229 (the)-229 (object\325)55 (s)-228 (synchronization)-228 (data.)]TJ +0 -1.4167 TD +[(2. Manipulate the synchronization data of that object. This operation sh\ +ould be f)11 (ast.)]TJ +T* +[(3. 
Release or hand of)25 (f the object\325)55 (s meta-lock.)]TJ +-1.1458 -1.75 TD +[(Meta-locks)-462 (play)-463 (a)-463 (similar)-463 (role)-463 (as)-463 (the)-463 (auxiliary)-462 (spin-locks)-463 (seen)-463 (in)-463 (man)15 (y)-462 (implementations)-463 (of)]TJ +0 -1.1667 TD +[(POSIX)-372 (threads)-373 ([6,)-372 (17].)-372 (These)-372 (spin-locks)-372 (pro)15 (vide)-372 (brief)-372 (e)15 (xclusi)26 (v)15 (e)-373 (access)-372 (to)-372 (the)-372 (synchronization)]TJ +T* +[(state maintained in records representing programmer)21 (-le)25 (v)15 (el mute)15 (x)15 (es and condition v)25 (ariables.)]TJ +0 -1.75 TD +[(W)80 (e)-300 (shall)-299 (use)-300 (the)-300 (term)-300 (\322monitor)20 (-lock\323)-300 (to)-300 (denote)-300 (the)-300 (lock)-300 (abstraction)-299 (e)15 (xported)-300 (by)-300 (the)-300 (Ja)20 (v)25 (a)-300 (virtual)]TJ +0 -1.1667 TD +[(machine)-358 (to)-358 (a)20 (v)20 (oid)-358 (confusion)-357 (with)-358 (the)-358 (meta-lock)-357 (used)-358 (in)-358 (its)-357 (implementation.)-357 (Because)-358 (meta-lock)]TJ +T* +[(acquisition)-353 (occurs)-354 (in)-354 (FIFO)-354 (order)40 (,)-353 (the)-354 (abo)15 (v)15 (e)-354 (pattern)-354 (allo)25 (ws)-354 (a)-354 (number)-354 (of)-353 (f)10 (airness)-354 (policies)-354 (at)-354 (the)]TJ +T* +[(monitor)-306 (le)25 (v)15 (el.)-306 (The)-307 (rest)-307 (of)-306 (this)-307 (section)-307 (describes)-306 (our)-306 (algorithm)-307 (in)-307 (detail:)-306 (Section)-307 (4.1)-307 (presents)-307 (the)]TJ +T* +[(data)-298 (structures)-298 (in)40 (v)20 (olv)15 (ed,)-298 (Section)-298 (4.2)-298 (the)-298 (meta-lock)-298 (algorithm,)-297 (Section)-298 (4.3)-298 (the)-298 (implementation)-297 (of)]TJ +T* +[(the)-377 (monitor)20 (-le)25 (v)15 (e)0 (l)-377 (lock)-377 (and)-378 (unlock)-378 (operations,)-377 (and)-377 (Section)-377 (4.4)-377 (the)-378 (implementation)-377 (of)-378 (w)10 (ait)-377 (and)]TJ +T* +[(notify)65 (.)]TJ +/N91 1 Tf +14 0 0 14 72 347.67 Tm +[(4.1 Data structur)18 (es)]TJ +/N92 1 Tf +12 0 0 12 72 323 Tm 
+[(Synchronization)-307 (in)40 (v)20 (olv)16 (es)-308 (three)-308 (entities:)-307 (threads,)-308 (objects,)-308 (and)-308 (lock)-307 (records.)-308 (W)80 (e)-308 (describe)-307 (the)-308 (rele-)]TJ +T* +[(v)25 (ant)-310 (parts)-311 (of)-310 (the)-310 (data)-311 (structures)-311 (that)-310 (implement)-311 (them)-311 (and)-310 (ho)25 (w)-311 (the)15 (y)-310 (interact)-311 (during)-311 (synchroniza-)]TJ +T* +(tion.)Tj +/N94 1 Tf +0 -1.75 TD +[(Thr)37 (eads)]TJ +/N92 1 Tf +3.2408 0 TD +[(.)-246 (W)80 (e)-246 (call)-246 (the)-245 (data)-246 (structure)-246 (that)-246 (holds)-245 (thread-speci\336c)-246 (state)-245 (an)]TJ +/N94 1 Tf +24.425 0 TD +[(e)20 (xecution)-246 (en)40 (vir)45 (onment)]TJ +/N92 1 Tf +9.2183 0 TD +(\()Tj +/N95 1 Tf +0.3325 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +(\).)Tj +-38.4167 -1.1667 TD +(Since)Tj +/N95 1 Tf +2.4683 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(s)-246 (and)-246 (threads)-246 (correspond)-246 (one)-246 (to)-246 (one,)]TJ +/N95 1 Tf +14.8025 0 TD +(EE)Tj +/N92 1 Tf +1.4467 0 TD +[(addresses)-245 (are)-246 (well-suited)-246 (as)-246 (unique)-246 (thread)-246 (iden-)]TJ +-19.9175 -1.1667 TD +[(ti\336ers.)-249 (Figure)-250 (1)-250 (sho)25 (ws)-250 (the)-250 (\336elds)-250 (in)]TJ +/N95 1 Tf +14.0292 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(s)-249 (that)-250 (the)-250 (synchronization)-249 (code)-249 (uses)-250 (as)-249 (well)-250 (as)-249 (an)-249 (initializa-)]TJ +-15.2292 -1.1667 TD +[(tion)-281 (function)-282 (that)-282 (sets)-282 (the)-282 (\336elds)-281 (to)-282 (their)-282 (steady-state)-281 (v)25 (alues.)-282 (The)-282 (mute)15 (x)16 (e)0 (s)-282 (and)-282 (condition)-282 (v)25 (ariables)]TJ +T* +[(in)-225 (the)]TJ +/N95 1 Tf +2.4508 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(s)-225 (are)-225 (used)-225 (to)-225 (a)20 (v)20 (oid)-225 (b)20 (usy-w)10 (aiting)-225 (when)-225 (contention)-225 (requires)-225 (threads)-225 (to)-224 (w)9 (ait)-225 (for)-226 (their)-225 (turn)-225 (to)]TJ +-3.6508 -1.1667 TD +[(lock)-227 (an)-227 (object,)-228 (while)-227 
(other)-227 (\336elds)-227 (serv)15 (e)-228 (t)0 (o)-227 (e)15 (xchange)-227 (information)-227 (between)-227 (threads)-227 (synchronizing)-227 (on)]TJ +T* +(the same object.)Tj +/N94 1 Tf +0 -1.75 TD +(Objects)Tj +/N92 1 Tf +3.055 0 TD +[(.)-265 (Figure)-250 (2)-265 (sho)25 (ws)-265 (the)-265 (object)-265 (layout)-265 (in)-264 (EVM.)-265 (Because)-264 (e)25 (v)15 (ery)-265 (object)-265 (may)-265 (potentially)-265 (be)-264 (used)]TJ +-3.055 -1.1667 TD +[(for)-350 (synchronization,)-349 (it)-350 (is)-350 (critical)-350 (to)-350 (minimize)-349 (the)-350 (per)20 (-object)-350 (space)-350 (o)15 (v)15 (erhead.)-349 (Objects)-350 (ha)20 (v)15 (e)-349 (tw)10 (o-)]TJ +T* +[(w)10 (ord)-313 (headers.)-312 (The)-312 (\336rst)-312 (w)9 (ord)-312 (points)-312 (to)-313 (the)-312 (object\325)55 (s)-312 (class.)-313 (Only)-312 (the)-313 (second)-312 (w)9 (ord)-312 (is)-313 (used)-312 (for)-313 (syn-)]TJ +T* +[(chronization,)-284 (b)21 (u)0 (t)-284 (i)0 (t)-284 (serv)15 (es)-284 (other)-284 (purposes)-284 (as)-284 (well,)-284 (such)-284 (as)-284 (holding)-284 (a)-284 (hash)-284 (code)]TJ +9.6 0 0 9.6 450.48 131.8 Tm +(2)Tj +12 0 0 12 458.69 127 Tm +[(and)-283 (garbage-col-)]TJ +ET +0 G +2 J +0 j +0.36 w +3.86 M +[]0 d +204 86 m +72 86 l +S +BT +10 0 0 10 72 71.33 Tm +[(2. 
EVM uses a handle-less cop)9 (ying memory system, so object or handle addresses cannot be used as hash\ + codes.)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 10 10 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(7)Tj +-19.25 32.1942 TD +[(lector)-284 (age)-284 (information,)-283 (so)-284 (we)-283 (call)-283 (it)-284 (the)]TJ +/N94 1 Tf +16.1825 0 TD +(multi-use)Tj +/N92 1 Tf +4.0058 0 TD +[(w)10 (ord.)-284 (W)80 (e)-283 (emplo)10 (y)-284 (a)]TJ +/N94 1 Tf +8.1717 0 TD +[(header)-283 (wor)37 (d)-284 (displacement)]TJ +/N92 1 Tf +-28.36 -1.1667 TD +[(technique)-299 (in)41 (v)15 (ented)-299 (by)-299 (our)-299 (colleague)-299 (Lars)-299 (Bak)-299 ([4].)-299 (The)-299 (tw)10 (o)-299 (least)-299 (signi\336cant)-299 (bits)-299 (in)-299 (the)-299 (multi-use)]TJ +T* +[(w)10 (ord,)-389 (called)-389 (the)]TJ +/N94 1 Tf +7.0708 0 TD +[(loc)20 (k)-389 (bits)]TJ +/N92 1 Tf +3.48 0 TD +[(,)-389 (hold)-389 (the)]TJ +/N94 1 Tf +4.4167 0 TD +[(loc)20 (k)-389 (state)10 (,)]TJ +/N92 1 Tf +4.5525 0 TD +[(which)-388 (serv)15 (e)-389 (a)0 (s)-389 (a)-389 (format)-389 (indicator)-388 (and)-388 (simulta-)]TJ +-19.52 -1.1667 TD +[(neously)-255 (encode)-254 (meta-lock)-254 (information)-254 (for)-255 (the)-255 (object.)-254 (Figure)-250 (3)-254 (sho)25 (ws)-255 (the)-254 (four)-255 (possible)-254 (lock)-255 (states)]TJ +T* +[(and)-244 (their)-244 (formats.)-245 (Objects)-245 (are)-244 (created)-245 (in)-244 (the)]TJ +/N95 1 Tf +17.7017 0 TD +(NEUTRAL)Tj +/N92 1 Tf +4.445 0 TD +[(state)-244 (\(in)-245 (f)11 (act,)-245 (the)-245 (majority)-244 (of)-245 (objects)-244 (ne)25 (v)15 (e)0 (r)]TJ +-22.1467 -1.1667 TD +[(lea)20 (v)15 (e)-365 (this)-365 (state\),)-366 (remain)-365 (in)-365 (this)-365 (state)-365 (as)-364 (long)-365 (as)-365 
(no)-365 (thread)-365 (synchronizes)-365 (on)-366 (them,)-365 (and)-365 (return)-364 (to)]TJ +/N95 1 Tf +T* +(NEUTRAL)Tj +/N92 1 Tf +4.5758 0 TD +[(once)-376 (synchronization)-376 (ceases.)-376 (A)-376 (monitor)21 (-lock)10 (ed)-376 (object)-377 (is)-375 (in)-376 (the)]TJ +/N95 1 Tf +26.4342 0 TD +(LOCKED)Tj +/N92 1 Tf +3.9758 0 TD +[(state.)-376 (The)]TJ +-34.9858 -1.1667 TD +[(high)-404 (30)-404 (bits)-404 (of)-404 (the)-405 (multi-use)-404 (w)10 (ord)-404 (hold)-404 (a)-404 (pointer)-404 (to)-404 (synchronization)-404 (data)-405 (\(a)]TJ +/N94 1 Tf +32.4258 0 TD +[(loc)20 (k)-404 (r)37 (ecor)37 (d)]TJ +/N92 1 Tf +4.6425 0 TD +[(,)-404 (see)]TJ +-37.0683 -1.1667 TD +[(belo)25 (w\))-292 (that)-291 (indicates)-291 (which)-292 (thread)-291 (o)25 (wns)-291 (the)-291 (monitor)20 (-lock)-291 (and)-291 (also)-291 (stores)-291 (the)-290 (displaced)-291 (hash)-291 (and)]TJ +T* +[(age)-348 (information.)-348 (The)]TJ +/N95 1 Tf +8.9583 0 TD +(WAITERS)Tj +/N92 1 Tf +4.5483 0 TD +[(state)-348 (is)-347 (entered)-349 (when)-347 (a)-348 (thread)-348 (releases)-348 (the)-348 (monitor)21 (-lock)-348 (while)]TJ +-13.5067 -1.1667 TD +[(other)-348 (threads)-349 (are)-348 (w)10 (aiting)-348 (to)-349 (acquire)-349 (the)-348 (lock)-349 (or)-349 (to)-349 (be)-348 (noti\336ed:)-347 (the)-349 (object)-348 (is)-349 (no)-348 (longer)-348 (monitor)21 (-)]TJ +T* +[(lock)10 (ed,)-320 (b)20 (u)0 (t)-319 (the)-319 (state)-319 (must)-320 (be)-319 (distinguished)-319 (from)]TJ +/N95 1 Tf +19.9417 0 TD +(NEUTRAL)Tj +/N92 1 Tf +4.52 0 TD +[(because)-319 (the)-320 (remainder)-319 (of)-319 (the)-320 (multi-)]TJ +-24.4617 -1.1667 TD +[(use)-299 (w)10 (ord)-299 (still)-300 (points)-299 (to)-299 (synchronization)-299 (data.)-300 (The)-299 (fourth)-299 (and)-299 (\336nal)-299 (state,)]TJ +/N95 1 Tf +29.3033 0 TD +(BUSY)Tj +/N92 1 Tf +2.2708 0 TD +[(,)-300 (indicates)-299 (that)-300 (the)]TJ +-31.5742 -1.1667 TD +[(object)-365 (is)-366 (meta-lock)11 (ed.)-366 (In)-365 (this)-366 (case,)-365 (the)-365 
(high)-366 (part)-365 (of)-366 (the)-365 (multi-use)-366 (w)10 (ord)-365 (contains)-366 (the)]TJ +/N95 1 Tf +35.0142 0 TD +(EE)Tj +/N92 1 Tf +1.565 0 TD +[(of)-366 (the)]TJ +-36.5792 -1.1667 TD +[(thread)-253 (that)-252 (has)-254 (the)-253 (meta-lock)]TJ +/N94 1 Tf +11.8192 0 TD +(or)Tj +/N92 1 Tf +1.1417 0 TD +(the)Tj +/N95 1 Tf +1.475 0 TD +(EE)Tj +/N92 1 Tf +1.4533 0 TD +[(of)-254 (a)-253 (thread)-253 (attempting)-253 (to)-254 (acquire)-253 (the)-253 (meta-lock,)-253 (as)-253 (will)-253 (be)]TJ +-15.8892 -1.1667 TD +[(e)15 (xplained)-227 (in)-227 (Section)-227 (4.2.)-227 (A)-227 (minimum)-227 (of)-228 (tw)10 (o)-228 (lock)-227 (bits)-227 (are)-227 (required)-227 (since)-227 (we)-227 (must)-227 (distinguish)-227 (three)]TJ +T* +[(states:)-329 (is)-329 (the)-329 (object)-329 (meta-lock)10 (ed)-328 (or)-329 (not,)-329 (and,)-329 (if)-329 (not,)-329 (whether)-329 (the)-329 (remaining)-329 (bits)-329 (of)-329 (the)-329 (multi-use)]TJ +T* +[(w)10 (ord)-247 (contain)-246 (their)-246 (original)-246 (contents)-246 (or)-246 (a)-246 (pointer)-246 (to)-246 (a)-247 (data)-246 (structure)-247 (into)-246 (which)-247 (those)-246 (contents)-246 (ha)20 (v)15 (e)]TJ +T* +[(been)-304 (displaced.)-304 (Ho)25 (we)25 (v)15 (e)0 (r)40 (,)-304 (while)-304 (the)-305 (minimum)-304 (is)-305 (three)-303 (states,)-305 (ef)25 (\336cienc)15 (y)-304 (f)10 (a)20 (v)20 (ors)-304 (use)-304 (of)-304 (all)-303 (four)-305 (bit)]TJ +T* +[(patterns,)-347 (allo)25 (wing)-347 (the)]TJ +/N95 1 Tf +9.1558 0 TD +(LOCKED)Tj +/N92 1 Tf +3.9475 0 TD +(and)Tj +/N95 1 Tf +1.7917 0 TD +(WAITERS)Tj +/N92 1 Tf +4.5475 0 TD +[(states)-347 (to)-348 (be)-348 (distinguished)-347 (in)-347 (the)-348 (lock)-348 (bits)-347 (rather)]TJ +-19.4425 -1.1667 TD +(than by a bit in the synchronization data.)Tj +/N94 1 Tf +0 -1.75 TD +[(Loc)20 (k)-382 (r)36 (ecor)37 (ds.)]TJ +/N92 1 Tf +5.9192 0 TD +[(Most)-381 (synchronization)-382 (data)-382 (is)-382 (k)10 (ept)-382 (not)-382 (in)-382 (objects)-382 (b)20 (u)0 (t)-382 (i)0 (n)]TJ 
+/N95 1 Tf +23.2342 0 TD +(LockRecord)Tj +/N92 1 Tf +6 0 TD +[(s.)-382 (A)-382 (lock)]TJ +-35.1533 -1.1667 TD +[(record)-411 (represents)-411 (a)-411 (thread)-411 (for)-411 (the)-411 (purpose)-410 (of)-411 (synchronization)-411 (on)-411 (a)-411 (particular)-411 (object.)-411 (Figure)-250 (4)]TJ +T* +[(sho)25 (ws)-258 (the)-259 (\336elds)-258 (of)-259 (a)-258 (lock)-258 (record:)-258 (the)-259 (o)25 (wner)-258 (thread,)-258 (the)-258 (number)-258 (of)-259 (times)-258 (the)-259 (thread)-258 (has)-258 (lock)10 (ed)-258 (the)]TJ +T* +[(object)-247 (\(recall)-247 (that)-247 (monitor)21 (-locks)-247 (are)-247 (recursi)25 (v)15 (e\),)-247 (a)-247 (\336eld)-247 (for)-247 (the)-247 (displaced)-246 (hash)-247 (and)-248 (age)-247 (information,)]TJ +4.2008 33.375 TD +[(Figure 1. Per)21 (-thread \(e)15 (x)15 (ecution en)40 (vironment\) \336elds used for synchronization)]TJ +9.66 -1 TD +[(and their steady-state v)25 (alues)]TJ +/N95 1 Tf +8 0 0 8 126 710.17 Tm +(typedef struct execenv {)Tj +2.4 -1.25 TD +[(Thread)-4200 (thread;)-9600 (/* ExecEnv is a subtype of Thread. */)]TJ +T* +[(mutex_t)-3600 (metaLockMutex;)-5400 (/* Used by slow-path meta-lock/unlock. 
*/)]TJ +T* +[(condvar_t)-2400 (metaLockCondvar;)-4200 (/* To wait for meta-lock hand-off.)-3000 (*/)]TJ +T* +[(bool_t)-4200 (gotMetaLockSlow;)-4200 (/* Wait for predecessor to give bits.)-1200 (*/)]TJ +T* +[(bool_t)-4200 (bitsForGrab;)-6600 (/* Wait for successor to grab bits.)-2400 (*/)]TJ +T* +[(BitField)-3000 (metaLockBits;)-6000 (/* Space to get/give releaseBits.)-3600 (*/)]TJ +T* +[(ExecEnv)-3000 (*succEE;)-9600 (/* Next thread to get the meta-lock.)-1800 (*/)]TJ +T* +[(mutex_t)-3600 (monitorLockMutex;)-3600 (/* Used by slow-path lock/unlock.)-3000 ( */)]TJ +T* +[(condvar_t)-2400 (monitorLockCondvar;)-2400 (/* To wait for monitor acquisition.)-2400 (*/)]TJ +T* +[(bool_t)-4200 (isWaitingForNotify;)-2400 (/* Am waiting for notification.)-4800 (*/)]TJ +T* +[(...)-600 (other fields ...)]TJ +-2.4 -1.25 TD +(} ExecEnv;)Tj +0 -2.5 TD +(void initializeEE\(ExecEnv *ee\) {)Tj +2.4 -1.25 TD +[(ee->gotMetaLockSlow)-2400 (= FALSE;)]TJ +-2.4 -1.25 TD +[( ee->bitsForGrab)-4800 (= FALSE;)]TJ +2.4 -1.25 TD +(ee->isWaitingForNotify = FALSE;)Tj +T* +[(ee->succEE)-7800 (= NULL;)]TJ +T* +(... 
initialize other fields ...)Tj +-2.4 -1.25 TD +(})Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 11 11 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(8)Tj +-19.25 27.2358 TD +[(a)-291 (queue)-291 (\336eld)-291 (for)-291 (linking)-291 (the)-291 (lock)-291 (records)-291 (of)-291 (all)-292 (threads)-291 (that)-291 (synchronize)-291 (on)-291 (a)-292 (g)0 (i)26 (v)15 (en)-292 (object,)-291 (and)-291 (a)]TJ +0 -1.1667 TD +[(free-list)-346 (\336eld)-347 (for)-346 (linking)-347 (lock)-347 (records)-347 (when)-346 (the)15 (y)-348 (are)-346 (not)-347 (in)-346 (use)-347 (\(see)-347 (Section)-347 (6.1\).)-346 (Figure)-251 (4)-347 (also)]TJ +T* +[(sho)25 (ws)-289 (an)-289 (object)-289 (with)-289 (three)-289 (lock)-290 (records)-289 (on)-289 (its)-289 (lock)-289 (queue.)-289 (In)-289 (this)-290 (e)16 (xample,)-289 (the)-290 (state)-289 (is)]TJ +/N95 1 Tf +35.15 0 TD +(LOCKED)Tj +/N92 1 Tf +3.6 0 TD +(,)Tj +-38.75 -1.1667 TD +[(so)-255 (one)-255 (of)-254 (the)-255 (lock)-255 (records)-255 (belongs)-255 (to)-254 (a)-255 (thread)-255 (that)-255 (holds)-255 (the)-255 (monitor)21 (-lock.)-255 (In)-255 (our)-254 (implementation,)]TJ +T* +[(ne)25 (w)-253 (lock)-254 (records)-254 (are)-253 (appended)-254 (to)-254 (the)-253 (end)-253 (of)-254 (the)-254 (queue)-253 (\(FIFO)-254 (order\))-253 (and)-253 (stay)-254 (in)-254 (order)40 (,)-253 (e)15 (xcept)-253 (that)]TJ +T* +[(when)-261 (a)-261 (thread)-261 (acquires)-260 (the)-260 (monitor)20 (-lock,)-261 (it)-261 (mo)16 (v)15 (e)0 (s)-261 (its)-261 (lock)-260 (record)-261 (to)-260 (the)-261 (front)-260 (so)-261 (that)-261 (the)-260 (\336rst)-261 (lock)]TJ +T* +[(record of a lock)10 (ed object al)10 (w)10 (ays belongs to the thread that holds the monitor)21 (-lock.)]TJ +0 -1.75 TD 
+[(An)-360 (object\325)55 (s)-359 (meta-lock)-359 (protects)-359 (its)-359 (synchronization)-359 (data,)-360 (which)-359 (we)-359 (can)-359 (no)25 (w)-359 (de\336ne)-359 (precisely)-359 (as)]TJ +0 -1.1667 TD +[(comprising)-262 (the)-262 (multi-use)-262 (w)10 (ord,)-262 (including)-262 (the)-262 (lock)-262 (queue)-262 (pointer)-263 (\(when)-262 (there)-262 (is)-262 (one\))-262 (and)-263 (the)-262 (lock)]TJ +T* +[(records)-230 (in)-231 (that)-231 (queue.)-230 (F)14 (o)0 (r)-230 (e)15 (xample,)-231 (if)-231 (a)-230 (thread)-231 (w)9 (ants)-231 (to)-230 (place)-231 (a)-231 (lock)-230 (record)-231 (in)-230 (the)-231 (queue)-230 (to)-231 (w)10 (ait)-231 (for)]TJ +T* +[(its)-252 (turn)-252 (to)-251 (acquire)-252 (the)-252 (monitor)21 (-lock,)-252 (it)-251 (meta-locks)-252 (the)-252 (object)-252 (to)-252 (gain)-251 (e)15 (xclusi)25 (v)15 (e)-252 (access)-252 (to)-252 (the)-251 (queue,)]TJ +T* +[(appends)-263 (its)-263 (lock)-263 (record,)-263 (and)-263 (then)-263 (releases)-263 (the)-263 (meta-lock.)-263 (Similarly)65 (,)-263 (t)0 (o)-263 (read)-263 (or)-264 (write)-263 (the)-263 (hash)-263 (code)]TJ +T* +[(of)-354 (an)-354 (object,)-353 (meta-locking)-354 (must)-354 (be)-353 (done)-354 (\(though)-353 (it)-354 (is)-354 (possible)-354 (to)-354 (optimize)-354 (reads)-353 (of)-354 (immutable)]TJ +T* +[(\336elds, lik)10 (e the hash code, most of the time\).)]TJ +/N91 1 Tf +14 0 0 14 72 139.67 Tm +[(4.2 Meta-locking: exclusi)10 (v)10 (e)0 ( access to synchr)18 (onization data)]TJ +/N92 1 Tf +12 0 0 12 72 115 Tm +[(Figure)-250 (5)-293 (sho)26 (ws)-293 (the)-293 (non-contention)-292 (\(f)10 (ast-path\))-293 (code)-293 (for)-292 (obtaining)-293 (and)-293 (releasing)-293 (an)-293 (object\325)55 (s)-293 (meta-)]TJ +T* +[(lock.)-322 (A)-323 (thread)-323 (attempts)-322 (to)-323 (gain)-322 (the)-323 (meta-lock)-322 (by)-322 (using)-322 (an)-323 (atomic)-322 (sw)10 (ap)-323 (operation)-322 (to)-323 (replace)-323 (the)]TJ +T* +[(object\325)55 (s)-238 (multi-use)-237 (w)10 (ord)-237 
(with)-237 (a)-238 (w)10 (ord)-238 (consisting)-237 (of)-237 (a)-238 (reference)-237 (to)-238 (the)-237 (thread\325)55 (s)]TJ +/N95 1 Tf +30.5042 0 TD +(EE)Tj +/N92 1 Tf +1.4375 0 TD +[(and)-237 (the)-237 (lo)25 (w-order)]TJ +-31.9417 -1.1667 TD +[(bits)-257 (representing)-258 (the)]TJ +/N95 1 Tf +8.3833 0 TD +(BUSY)Tj +/N92 1 Tf +2.6575 0 TD +[(state.)-258 (If)-258 (the)-257 (w)9 (ord)-258 (returned)-257 (by)-258 (the)-257 (sw)9 (ap)-258 (operation)-257 (has)-258 (lo)25 (w-order)-257 (bits)-257 (in)]TJ +ET +0 G +2 J +0 j +0.6 w +3.86 M +[]0 d +295.26 712.67 72 -90 re +367.26 694.67 m +295.26 694.67 l +367.26 676.67 m +295.26 676.67 l +S +BT +10 0 0 10 300.26 701.17 Tm +(class ptr)Tj +0 -1.8 TD +[(multi-use w)10 (ord)]TJ +0 -3.6 TD +[(user)20 (-de\336ned)]TJ +0 -1 TD +(\336elds)Tj +ET +394.26 694.67 m +376.26 703.67 l +394.26 694.67 m +376.26 685.67 l +S +BT +10 0 0 10 396.51 692.09 Tm +[(header w)10 (ords)]TJ +-23.625 -0.325 TD +(all synchronization)Tj +0 -1 TD +(is done on this \336eld)Tj +ET +0 J +1.08 w +275.23 683.03 m +284.45 685.67 l +275.23 688.31 l +275.23 685.67 l +s +275.23 683.03 m +284.45 685.67 l +275.23 688.31 l +275.23 685.67 l +f* +2 J +0.6 w +274.73 685.67 m +241.26 685.67 l +S +BT +10 0 0 10 295.26 613.67 Tm +(object)Tj +12 0 0 12 229.34 591.17 Tm +(Figure 2. 
Object layout in EVM)Tj +ET +1 g +1.08 w +81 526.67 219 -18.5 re +B* +207.5 508.67 m +207.5 526.67 l +252.5 508.67 m +252.5 526.67 l +S +BT +10 0 0 10 123.07 514.34 Tm +0 g +(hash code)Tj +9.793 0.083 TD +[(age)-2112 (Lock \(00\))]TJ +-9.15 -1.8 TD +(25 bits)Tj +9.05 0.05 TD +[(5 bits)-2155 (2 bits)]TJ +ET +1 g +312.86 464.17 215.71 -18.5 re +B* +481.78 446.17 m +481.78 464.17 l +S +BT +9.85 0 0 10 342.41 452.67 Tm +0 g +[(Ex)16 (ecution en)39 (vironment \(EE\))-3057 (Lock \(11\))]TJ +4.1492 -1.85 TD +(30 bits)Tj +11.1005 0.1 TD +(2 bits)Tj +ET +1 g +312 527.17 219 -18.5 re +B* +483.5 509.17 m +483.5 527.17 l +S +BT +10 0 0 10 342 515.67 Tm +0 g +[(LockRecord \(LR\))-7363 (Lock \(10\))]TJ +4.15 -1.85 TD +(30 bits)Tj +11.1 0.1 TD +(2 bits)Tj +ET +1 g +81 464.17 219 -18.5 re +B* +252.5 446.17 m +252.5 464.17 l +S +BT +10 0 0 10 111 452.67 Tm +0 g +[(LockRecord \(LR\))-7363 (Lock \(01\))]TJ +4.15 -1.85 TD +(30 bits)Tj +11.1 0.1 TD +(2 bits)Tj +/N95 1 Tf +-18.25 3.44 TD +(LOCKED)Tj +/N92 1 Tf +12 0 0 12 172.73 406 Tm +[(Figure 3. 
Possible states for an object\325)56 (s multi-use w)10 (ord)]TJ +/N95 1 Tf +10 0 0 10 315 469.57 Tm +(BUSY)Tj +-23.4 6.3 TD +[(NEUTRAL)-19200 (WAITERS)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 12 12 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 303 34.17 Tm +0 g +0 Tc +0 Tw +(9)Tj +-19.25 31.0417 TD +[(an)15 (y)-314 (state)-315 (other)-314 (than)]TJ +/N95 1 Tf +8.2975 0 TD +(BUSY)Tj +/N92 1 Tf +2.2708 0 TD +[(,)-315 (the)-315 (thread)-314 (has)-315 (acquired)-314 (the)-315 (meta-lock)-314 (and)-315 (may)-315 (proceed.)-314 (Ho)25 (we)25 (v)15 (e)0 (r)40 (,)-315 (if)]TJ +-10.5683 -1.1667 TD +[(the)-305 (returned)-305 (w)10 (ord\325)55 (s)-305 (l)0 (o)25 (w-order)-305 (bits)-305 (indicate)-305 (the)-305 (object)-305 (is)]TJ +/N95 1 Tf +22.8733 0 TD +(BUSY)Tj +/N92 1 Tf +2.2708 0 TD +[(,)-305 (then)-305 (some)-305 (other)-305 (thread)-305 (holds)-305 (the)]TJ +-25.1442 -1.1667 TD +[(meta-lock,)-307 (so)-307 (the)-307 (current)-306 (thread)-307 (in)40 (v)20 (o)0 (k)10 (e)0 (s)]TJ +/N95 1 Tf +16.5733 0 TD +(getMetaLockSlow\(\))Tj +/N92 1 Tf +10.2 0 TD +[(.)-307 (I)0 (n)-307 (this)-307 (case,)-306 (the)-307 (threads)-307 (con-)]TJ +-26.7733 -1.1667 TD +[(tending)-237 (for)-236 (the)-236 (meta-lock)-236 (are)-236 (totally)-236 (ordered)-236 (by)-237 (the)-236 (order)-237 (in)-236 (which)-236 (the)-237 (sw)10 (ap)-236 (instructions)-236 (occurred.)]TJ +T* +[(The)-232 (\336rst)-231 (thread)-232 (in)-231 (this)-232 (order)-231 (kno)25 (ws,)-231 (since)-232 (it)-231 (acquired)-231 (the)-231 (meta-lock,)-231 (that)-232 (it)-231 (has)-231 (no)-232 (predecessor)40 (,)-231 (and)]TJ +T* +[(e)25 (v)15 (ery other thread kno)26 (ws its predecessor from the)]TJ +/N95 1 Tf +20.1517 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[( in the w)10 (ord read by the 
sw)11 (ap.)]TJ +-21.3517 -1.75 TD +[(A)-328 (thread)-328 (acquires)-328 (the)-328 (meta-lock)-327 (to)-329 (perform)-327 (some)-328 (operation)-328 (on)-327 (the)-328 (synchronization)-327 (data.)-328 (At)-328 (the)]TJ +0 -1.1667 TD +[(end)-248 (of)-248 (this)-247 (operation,)-248 (it)-247 (releases)-247 (the)-248 (meta-lock)-247 (and)-248 (sets)-247 (the)-248 (multi-use)-247 (w)9 (ord)-248 (of)-247 (the)-248 (object)-248 (to)-248 (a)-248 (v)25 (alue)]TJ +T* +[(appropriate)-315 (to)-314 (the)-315 (ne)25 (w)-315 (state)-315 (of)-314 (the)-315 (synchronization)-314 (data.)-315 (This)-315 (ne)25 (w)-315 (v)25 (alue)-315 (may)-315 (be)-314 (an)15 (y)-315 (non-)]TJ +/N95 1 Tf +36.6 0 TD +(BUSY)Tj +/N92 1 Tf +-36.6 -1.1667 TD +[(v)25 (alue;)-295 (the)-295 (operation)-295 (might)-295 (release)-296 (the)-295 (lock)-295 (and)-295 (restore)-296 (the)-295 (displaced)-295 (multi-use)-295 (w)10 (ord)-295 (bits)-295 (and)-296 (the)]TJ +/N95 1 Tf +T* +(NEUTRAL)Tj +/N92 1 Tf +4.555 0 TD +[(lock)-355 (state,)-354 (or)-354 (it)-355 (might)-355 (change)-355 (the)-355 (queue)-354 (pointer)-354 (to)-355 (point)-354 (to)-355 (a)-354 (n)0 (e)25 (w)-355 (lock)-355 (record,)-355 (or)-354 (it)]TJ +-4.555 -1.1667 TD +[(might)-334 (lea)20 (v)15 (e)-334 (the)-335 (multi-use)-334 (w)10 (ord)-334 (unchanged.)-334 (In)-334 (an)15 (y)-334 (case,)-335 (call)-334 (this)-334 (ne)25 (w)-334 (multi-use)-334 (w)10 (ord)-334 (v)25 (alue)-335 (the)]TJ +/N94 1 Tf +T* +[(r)37 (elease)-293 (bits)]TJ +/N92 1 Tf +4.8833 0 TD +[(of)-294 (the)-293 (operation.)-294 (T)80 (o)-293 (accomplish)-294 (the)-293 (meta-lock)-293 (release,)-294 (a)-293 (thread)-293 (uses)-293 (an)-293 (atomic)-294 (com-)]TJ +-4.8833 -1.1667 TD +[(pare-and-sw)11 (ap)-287 (\()]TJ +/N95 1 Tf +6.4958 0 TD +(CAS)Tj +/N92 1 Tf +1.8 0 TD +[(\))-286 (operation)-287 (to)-286 (atomically)-287 (compare)-286 (the)-287 (current)-286 (contents)-287 (of)-286 (the)-287 (object\325)55 (s)-287 (multi-)]TJ +-8.2958 -1.1667 TD +[(use)-254 (w)10 (ord)-254 
(with)-254 (what)-253 (it)-254 (had)-253 (written)-254 (there)-253 (\(i.e.,)-254 (its)]TJ +/N95 1 Tf +19.22 0 TD +(EE)Tj +/N92 1 Tf +1.4542 0 TD +[(and)-253 (the)]TJ +/N95 1 Tf +3.1733 0 TD +(BUSY)Tj +/N92 1 Tf +2.6542 0 TD +[(state\))-253 (and,)-254 (if)-253 (it)-254 (is)-254 (still)-254 (the)-254 (same,)]TJ +-26.5017 -1.1667 TD +[(write)-286 (the)-285 (release)-286 (bits.)-286 (If)-286 (the)-286 (comparison)-285 (f)11 (ails,)-286 (then)-285 (some)-286 (other)-286 (thread)-286 (has)-286 (attempted)-285 (to)-286 (obtain)-286 (the)]TJ +T* +[(meta-lock)-347 (and)-348 (is)-346 (no)25 (w)-347 (w)9 (aiting)-347 (for)-346 (its)-347 (turn.)-347 (In)-347 (this)-347 (case,)-347 (the)-347 (releasing)-346 (thread)-347 (will)-347 (\322hand)-347 (of)25 (f\323)-347 (the)]TJ +T* +[(meta-lock)-287 (to)-288 (the)-287 (ne)15 (xt)-288 (thread)-288 (in)-287 (the)-288 (order)-287 (induced)-287 (by)-287 (the)-288 (sw)10 (ap)-288 (operations,)-287 (by)-288 (calling)]TJ +/N95 1 Tf +34.2 0 TD +(release-)Tj +-34.2 -1.1667 TD +(MetaLockSlow\(\))Tj +/N92 1 Tf +8.4 0 TD +[(.)-290 (The)-289 (aim)-290 (is)-290 (to)-290 (reach)-289 (the)-290 (state)-289 (that)-290 (w)10 (ould)-290 (ha)21 (v)15 (e)-290 (been)-289 (reached)-290 (if)-290 (the)-290 (releasing)]TJ +-8.4 -1.1667 TD +[(thread)-281 (had)-282 (completed)-281 (its)-282 (meta-lock)-281 (release,)-281 (writing)-282 (out)-281 (its)-282 (non-)]TJ +/N95 1 Tf +25.5567 0 TD +(BUSY)Tj +/N92 1 Tf +2.6817 0 TD +[(release)-281 (bits,)-282 (before)-281 (its)-282 (suc-)]TJ +-28.2383 -1.1667 TD +[(cessor)-429 (performed)-430 (its)-430 (atomic)-430 (sw)10 (ap)-430 (operation,)-429 (causing)-430 (that)-429 (operation)-429 (to)-430 (read)-430 (the)-430 (predecessor\325)56 (s)]TJ +T* +(release bits.)Tj +0 -1.75 TD +[(The)-242 (main)-242 (complication)-241 (in)-241 (the)-241 (slo)25 (w-path)-241 (meta-lock)-241 (hand-of)25 (f)-242 (i)0 (s)-241 (that)-242 (each)-241 (thread)-241 (in)-242 (the)-241 (atomic)-241 (sw)10 (ap)]TJ +0 -1.1667 TD +[(total)-304 
(order)-304 (kno)25 (ws)-304 (the)-305 (identity)-304 (of)-304 (its)-304 (predecessor)40 (,)-304 (b)20 (ut)-304 (not)-304 (of)-304 (its)-304 (successor)56 (.)-305 (\(Note)-304 (that)-304 (the)-305 (changed)]TJ +3.8617 44.6525 TD +[(Figure)-241 (4.)-241 (Lock)-241 (records)-242 (and)-241 (ho)25 (w)-241 (the)15 (y)-241 (are)-241 (chained)-242 (out)-241 (of)-241 (the)-241 (multi-use)-241 (w)10 (ord)-241 (of)]TJ +13.8192 -1 TD +(an object)Tj +ET +0 G +2 J +0 j +0.6 w +3.86 M +[]0 d +315 711 54 -72 re +369 693 m +315 693 l +369 675 m +315 675 l +S +BT +/N95 1 Tf +7 0 0 7 81 706.33 Tm +(typedef struct LockRecord_s LockRecord;)Tj +0 -1.1428 TD +(struct LockRecord_s {)Tj +1.2 -1.1429 TD +[(ExecEnv *owner;)-4200 (/* Owner thread.)-3600 (*/)]TJ +T* +[(int)-3600 (lockCount;)-1800 (/* # recursive locks. */)]TJ +T* +[(BitField storedBits;)-1200 (/* Hash and age.)-3000 ( */)]TJ +T* +[(LockRecord *queue;)-2400 (/* Lock queue on obj. */)]TJ +T* +[(LockRecord *nextFree; /* Free-list.)-5400 (*/)]TJ +-1.2 -1.1429 TD +(};)Tj +/N92 1 Tf +10 0 0 10 323 700.17 Tm +(class ptr)Tj +-0.8 -7.017 TD +[(lock)10 (ed object)]TJ +ET +0 J +1.08 w +395.12 696.43 m +403.26 701.5 l +393.66 701.51 l +394.39 698.97 l +s +395.12 696.43 m +403.26 701.5 l +393.66 701.51 l +394.39 698.97 l +f* +2 J +0.6 w +393.91 698.83 m +342 684 l +353.5 675 m +353.5 693 l +S +BT +10 0 0 10 357.5 680.5 Tm +(01)Tj +ET +405 711 27 -27 re +S +BT +10 0 0 10 405.46 675.67 Tm +(lock)Tj +ET +0 J +1.08 w +440.88 695.24 m +448.33 701.29 l +438.81 700.11 l +439.85 697.68 l +s +440.88 695.24 m +448.33 701.29 l +438.81 700.11 l +439.85 697.68 l +f* +2 J +0.6 w +439.39 697.48 m +423 690.5 l +S +BT +10 0 0 10 405.46 665.67 Tm +(record)Tj +ET +450 711 27 -27 re +S +BT +10 0 0 10 450.46 675.67 Tm +(lock)Tj +ET +0 J +1.08 w +485.88 695.24 m +493.33 701.29 l +483.81 700.11 l +484.85 697.68 l +s +485.88 695.24 m +493.33 701.29 l +483.81 700.11 l +484.85 697.68 l +f* +2 J +0.6 w +484.39 697.48 m +468 690.5 l +S +BT +10 0 0 10 450.46 
665.67 Tm +(record)Tj +ET +495 711 27 -27 re +S +BT +10 0 0 10 495.46 675.67 Tm +(lock)Tj +0 -1 TD +(record)Tj +/N95 1 Tf +7 0 0 7 84.5 537.95 Tm +(BitField getMetaLock\(ExecEnv *ee, Object *obj\) {)Tj +1.2 -2.2857 TD +(BitField busyBits = ee | BUSY;)Tj +0 -1.1429 TD +(BitField lockBits =)Tj +1.2 -1.1429 TD +(SWAP\(busyBits, multiUseWordAddr\(obj\)\);)Tj +-1.2 -1.1429 TD +(return getLockState\(lockBits\) != BUSY ?)Tj +1.2 -1.1429 TD +(lockBits : getMetaLockSlow\(ee, lockBits\);)Tj +-2.4 -1.1429 TD +(})Tj +31.9486 8 TD +(void releaseMetaLock\(ExecEnv *ee, Object *obj,)Tj +12.6 -1.1429 TD +(BitField releaseBits\) {)Tj +-11.4 -1.1428 TD +(BitField busyBits = ee | BUSY;)Tj +T* +(BitField lockBits = CAS\(releaseBits, busyBits,)Tj +14.4 -1.1429 TD +(multiUseWordAddr\(obj\)\);)Tj +-14.4 -1.1429 TD +(if \(lockBits != busyBits\))Tj +1.2 -1.1429 TD +(releaseMetaLockSlow\(ee, releaseBits\);)Tj +-2.4 -1.1429 TD +(})Tj +/N92 1 Tf +12 0 0 12 195.77 458.33 Tm +[(Figure 5. F)16 (ast paths for meta-lock operations)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 13 13 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(10)Tj +-19 56.4858 TD +[(multi-use)-242 (w)10 (ord)-242 (v)25 (alue)-242 (that)-242 (causes)-242 (the)-242 (f)10 (ast)-242 (path)-242 (of)-242 (meta-lock)-242 (release)-242 (to)-242 (f)11 (ail)-242 (is)]TJ +/N94 1 Tf +30.2192 0 TD +(not)Tj +/N92 1 Tf +1.52 0 TD +[(necessarily)-243 (that)-242 (of)]TJ +-31.7392 -1.1667 TD +[(the)-410 (releasing)-411 (thread\325)55 (s)-410 (successor;)-411 (se)25 (v)15 (eral)-411 (threads)-410 (may)-410 (ha)20 (v)15 (e)-411 (performed)-411 (sw)10 (aps)-410 (since)-411 (the)-411 (current)]TJ +T* +[(thread)-316 (acquired)-316 (the)-315 (meta-lock,)-316 (and)-316 (the)-316 (v)25 (alue)-316 
(present)-316 (will)-316 (re\337ect)-316 (the)]TJ +/N95 1 Tf +27.8608 0 TD +(EE)Tj +/N92 1 Tf +1.5158 0 TD +[(of)-316 (the)-316 (last)-316 (such)-316 (thread.\))]TJ +-29.3767 -1.1667 TD +[(Because)-313 (of)-313 (this)-313 (asymmetry)65 (,)-313 (the)-313 (hand-of)25 (f)-313 (from)-313 (a)-313 (predecessor)-313 (to)-313 (its)-313 (successor)-313 (synchronizes)-312 (using)]TJ +T* +[(state)-332 (in)-331 (the)-332 (predecessor\325)56 (s)]TJ +/N95 1 Tf +10.5467 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(.)-332 (A)0 (s)-331 (w)0 (e)-332 (s)0 (a)16 (w)-332 (in)-331 (Figure)-250 (1,)-332 (that)-332 (state)-331 (includes)-332 (a)-331 (mute)14 (x)-332 (and)-331 (condition)]TJ +-11.7467 -1.1667 TD +[(v)25 (ariable)-279 (pair)]TJ +/N95 1 Tf +5.3092 0 TD +(metaLockMutex)Tj +/N92 1 Tf +7.8 0 TD +0 Tw +(/)Tj +/N95 1 Tf +0.2783 0 TD +(metaLockCondvar)Tj +/N92 1 Tf +8.96 0 TD +[(,)-279 (a)-279 (\336eld)-279 (to)-279 (record)-279 (the)-279 (successor\325)55 (s)]TJ +/N95 1 Tf +13.4792 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(,)-279 (and)]TJ +-37.0267 -1.1667 TD +[(se)25 (v)15 (eral)-252 (booleans)-251 (used)-252 (to)-252 (coordinate)-251 (the)-252 (transfer)-252 (of)-252 (the)-252 (v)25 (alue)-252 (of)-251 (the)-252 (release)-252 (bits.)-252 (The)-252 (mute)15 (x)-253 (i)0 (s)-251 (used)]TJ +T* +[(to)-271 (ensure)-271 (that)-272 (the)-270 (threads)-271 (participating)-271 (in)-271 (the)-270 (hand-of)25 (f)-271 (update)-271 (the)-271 (other)-271 (\336elds)-271 (in)-271 (the)-270 (correct)-271 (order)55 (.)]TJ +T* +[(The)-310 (condition)-310 (v)25 (ariable)-310 (is)-310 (used)-310 (to)-309 (block)-310 (whiche)25 (v)15 (e)0 (r)-310 (thread)-310 (enters)-310 (the)-310 (hand-of)25 (f)-310 (\336rst)-311 (until)-309 (the)-310 (other)]TJ +T* +(thread is ready to complete the transaction.)Tj +0 -1.75 TD +[(The)-247 (hand-of)25 (f)-248 (protocol)-248 (proceeds)-247 (in)-248 (one)-248 (of)-248 (tw)10 (o)-247 (w)9 (ays,)-248 (as)-248 (sho)25 (wn)-248 (in)-248 (Figure)-250 (6.)-247 (The)-248 
(predecessor)-247 (thread)]TJ +0 -1.1667 TD +[(releasing)-238 (the)-239 (meta-lock)-239 (and)-238 (the)-239 (successor)-239 (thread)-238 (attempting)-239 (to)-239 (acquire)-239 (it)-238 (\322race\323)-239 (to)-239 (acquire)]TJ +/N95 1 Tf +36 0 TD +(meta-)Tj +-36 -1.1667 TD +(LockMutex)Tj +/N92 1 Tf +5.63 0 TD +[(in)-230 (the)-230 (predecessor)-231 (thread\325)55 (s)]TJ +/N95 1 Tf +10.8075 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(.)-231 (I)0 (f)-230 (the)-230 (predecessor)-231 (thread)-230 (wins)-230 (the)-230 (race,)-231 (it)-230 (will)-231 (set)-231 (the)]TJ +/N95 1 Tf +-17.6375 -1.1667 TD +(bitsForGrab)Tj +/N92 1 Tf +6.8858 0 TD +[(\336eld)-286 (in)-285 (its)]TJ +/N95 1 Tf +4.3583 0 TD +(EE)Tj +/N92 1 Tf +1.4858 0 TD +(to)Tj +/N95 1 Tf +1.0642 0 TD +(TRUE)Tj +/N92 1 Tf +2.4 0 TD +[(.)-286 (I)0 (f)-286 (the)-286 (successor)-285 (thread)-286 (wins)-286 (the)-285 (race,)-287 (it)-286 (will)-286 (change)-286 (the)]TJ +/N95 1 Tf +-16.1942 -1.1667 TD +(succEE)Tj +/N92 1 Tf +3.9225 0 TD +[(\336eld)-322 (of)-322 (its)-322 (predecessor\325)55 (s)]TJ +/N95 1 Tf +10.2308 0 TD +(EE)Tj +/N92 1 Tf +1.5225 0 TD +[(from)-322 (the)-322 (def)10 (ault)-322 (v)25 (alue)-322 (of)]TJ +/N95 1 Tf +10.5175 0 TD +(NULL)Tj +/N92 1 Tf +2.7217 0 TD +[(to)-323 (the)-321 (address)-323 (of)-322 (its)-322 (o)25 (w)0 (n)]TJ +/N95 1 Tf +-28.915 -1.1667 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(.)-303 (Thus,)-302 (each)-303 (thread)-303 (may)-303 (determine)-302 (whether)-302 (it)-302 (w)9 (o)0 (n)-302 (the)-303 (race)-302 (by)-302 (noting)-302 (whether)-303 (the)-302 (competitor)]TJ +-1.2 -1.1667 TD +(has made the corresponding change.)Tj +/N94 1 Tf +0 -1.75 TD +[(Case)-238 (1:)-239 (successor)-238 (\(acquiring\))-238 (thr)37 (ead)-238 (wins)-238 (r)16 (ace)15 (.)]TJ +/N92 1 Tf +19.35 0 TD +[(When)-239 (the)-238 (successor)-239 (thread)-238 (obtains)-238 (the)-238 (mute)15 (x)-238 (and)]TJ +-19.35 -1.1667 TD +[(its)-228 (predecessor\325)56 (s)]TJ +/N95 1 Tf +6.7883 0 TD +(bitsForGrab)Tj +/N92 1 Tf +6.8283 0 
TD +[(\336eld)-228 (is)]TJ +/N95 1 Tf +2.9017 0 TD +(FALSE)Tj +/N92 1 Tf +3 0 TD +[(,)-227 (i)0 (t)-229 (kno)25 (ws)-228 (it)-228 (acquired)-228 (the)-228 (mute)15 (x)-228 (before)-228 (the)-228 (prede-)]TJ +-19.5183 -1.1667 TD +[(cessor)55 (.)-247 (I)0 (n)-247 (this)-247 (e)25 (v)15 (ent,)-247 (it)-247 (updates)-247 (the)-247 (predecessor\325)55 (s)]TJ +/N95 1 Tf +19.9342 0 TD +(succEE)Tj +/N92 1 Tf +3.6 0 TD +[(,)-247 (and)-248 (w)10 (aits)-248 (for)-246 (the)-247 (predecessor)-247 (to)-247 (com-)]TJ +-23.5342 -1.1667 TD +[(plete)-241 (the)-240 (transaction.)-240 (When)-241 (the)-240 (predecessor)-241 (acquires)-240 (the)-240 (mute)15 (x,)-241 (it)-241 (notes)-241 (from)-241 (the)-240 (non-)]TJ +/N95 1 Tf +34.2183 0 TD +(NULL)Tj +/N92 1 Tf +2.6408 0 TD +[(v)25 (alue)]TJ +-36.8592 -1.1667 TD +[(in)-274 (its)]TJ +/N95 1 Tf +2.2725 0 TD +(succEE)Tj +/N92 1 Tf +3.875 0 TD +[(\336eld)-274 (that)-275 (the)-275 (successor)-274 (went)-274 (\336rst,)-275 (and)-275 (therefore)-275 (completes)-274 (the)-275 (meta-lock)-274 (hand-of)25 (f)]TJ +-6.1475 -1.1667 TD +[(by)-317 (setting)-318 (the)-318 (successor\325)55 (s)]TJ +/N95 1 Tf +10.6592 0 TD +(metaLockBits)Tj +/N92 1 Tf +7.5175 0 TD +[(to)-318 (the)-318 (release)-318 (bits,)-317 (setting)]TJ +/N95 1 Tf +10.7275 0 TD +(gotMetaLockSlow)Tj +/N92 1 Tf +9.3175 0 TD +(to)Tj +-38.2217 -1.1667 TD +[(indicate)-313 (that)-313 (those)-313 (bits)-313 (are)-314 (v)25 (alid,)-313 (and)-313 (w)10 (aking)-313 (the)-314 (successor)-313 (by)-313 (signalling)]TJ +/N95 1 Tf +29.805 0 TD +(metaLockCondvar)Tj +/N92 1 Tf +8.945 0 TD +(.)Tj +-38.75 -1.1667 TD +[(Finally)65 (,)-314 (the)-314 (predecessor)-315 (releases)-314 (the)-314 (mute)15 (x,)-314 (allo)25 (wing)-314 (the)-315 (successor)-314 (to)-314 (continue,)-314 (ha)20 (ving)-314 (acquired)]TJ +T* +(the meta-lock.)Tj +/N94 1 Tf +0 -1.75 TD +[(Case)-292 (2:)-291 (pr)37 (edecessor)-292 (\(r)37 (eleasing\))-290 (thr)37 (ead)-292 (wins)-291 (r)15 (ace)15 (.)]TJ +/N92 1 Tf +20.4233 0 TD 
+[(Here)-292 (the)-291 (predecessor)-291 (thread)-292 (determines)-291 (that)-292 (it)]TJ +-20.4233 -1.1667 TD +[(acquired)-272 (the)-272 (mute)14 (x)-272 (\336rst)-272 (by)-272 (noting)-272 (that)-272 (its)]TJ +/N95 1 Tf +16.885 0 TD +(succEE)Tj +/N92 1 Tf +3.8725 0 TD +[(\336eld)-272 (is)-272 (still)]TJ +/N95 1 Tf +4.7625 0 TD +(NULL)Tj +/N92 1 Tf +2.4 0 TD +[(.)-272 (I)0 (t)-272 (does)-272 (not)-273 (kno)25 (w)-272 (the)-272 (iden-)]TJ +-27.92 -1.1667 TD +[(tity)-329 (of)-329 (its)-329 (successor)41 (,)-329 (b)20 (ut)-329 (it)-329 (kno)25 (ws)-329 (that)-329 (the)-329 (successor)-329 (kno)25 (ws)-329 (its)-329 (identity)65 (.)-329 (Thus,)-328 (it)-329 (sets)-329 (the)]TJ +/N95 1 Tf +36 0 TD +(meta-)Tj +-36 -1.1667 TD +(LockBits)Tj +/N92 1 Tf +5.1058 0 TD +[(\336eld)-307 (of)-306 (its)]TJ +/N95 1 Tf +4.475 0 TD +(EE)Tj +/N92 1 Tf +1.5067 0 TD +[(to)-306 (the)-306 (proper)-306 (release)-306 (bits)-306 (v)25 (alue,)-306 (and)-306 (sets)-307 (the)]TJ +/N95 1 Tf +18.1442 0 TD +(bitsForGrab)Tj +/N92 1 Tf +6.9058 0 TD +[(\336eld)-306 (to)]TJ +/N95 1 Tf +-36.1375 -1.1667 TD +(TRUE)Tj +/N92 1 Tf +2.625 0 TD +[(to)-226 (indicate)-225 (that)-226 (those)-226 (bits)-225 (are)-226 (v)25 (alid,)-226 (and)-225 (w)10 (aits)-225 (for)-226 (the)-225 (successor)-225 (to)-225 (read)-226 (the)-225 (bits)-225 (\(releasing)-226 (the)]TJ +-2.625 -1.1667 TD +[(mute)15 (x)-264 (i)0 (n)-264 (the)-265 (process\).)-264 (The)-264 (successor)-264 (thread)-264 (obtains)-265 (the)-264 (mute)15 (x,)-265 (sees)-264 (that)-265 (its)-264 (predecessor\325)55 (s)]TJ +/N95 1 Tf +36 0 TD +(bits-)Tj +-36 -1.1667 TD +(ForGrab)Tj +/N92 1 Tf +4.5008 0 TD +(is)Tj +/N95 1 Tf +0.9675 0 TD +(TRUE)Tj +/N92 1 Tf +2.4 0 TD +[(,)-301 (and)-300 (thus)-300 (realizes)-301 (that)-301 (it)-301 (has)-300 (acquired)-300 (the)-300 (mute)15 (x)-301 (second,)-300 (and)-301 (that)-301 (the)-300 (release)]TJ +-7.8683 -1.1667 TD +[(bits)-243 (are)-243 (a)20 (v)25 (ailable)-243 (in)-244 (its)-242 (predecessor\325)55 (s)]TJ +/N95 
1 Tf +14.8008 0 TD +(metaLockBits)Tj +/N92 1 Tf +7.4433 0 TD +[(\336eld.)-243 (It)-243 (copies)-243 (those)-244 (bits,)-242 (resets)-244 (the)-243 (prede-)]TJ +-22.2442 -1.1667 TD +[(cessor\325)55 (s)]TJ +/N95 1 Tf +3.4392 0 TD +(bitsForGrab)Tj +/N92 1 Tf +6.8733 0 TD +[(to)-274 (the)-273 (def)10 (ault)-273 (v)25 (alue)-273 (of)]TJ +/N95 1 Tf +9.1075 0 TD +(FALSE)Tj +/N92 1 Tf +3.2733 0 TD +[(to)-273 (indicate)-274 (that)-272 (the)-274 (hand-of)25 (f)-273 (i)0 (s)-274 (complete,)]TJ +-22.6933 -1.1667 TD +[(signals)-426 (its)-426 (predecessor\325)55 (s)-426 (condition)-426 (v)25 (ariable)-426 (to)-426 (inform)-425 (it)-426 (of)-426 (that)-426 (f)10 (act,)-426 (and,)-426 (\336nally)65 (,)-426 (releases)-426 (the)]TJ +T* +[(mute)15 (x.)]TJ +0 -1.75 TD +[(The)-317 (meta-lock)-317 (protocol)-317 (guarantees)-316 (that)-317 (threads)-316 (obtain)-317 (the)-316 (meta-lock)-317 (in)-317 (the)-317 (order)-317 (determined)-317 (by)]TJ +0 -1.1667 TD +[(the)-240 (e)16 (x)15 (ecution)-239 (of)-239 (the)-240 (atomic)-239 (sw)9 (ap)-239 (operations.)-239 (A)-240 (thread)-239 (need)-239 (only)-239 (w)9 (ait)-239 (for)-239 (threads)-239 (ahead)-240 (of)-239 (it)-240 (in)-239 (the)]TJ +T* +[(sw)10 (ap)-254 (order)40 (,)-253 (s)0 (o)-254 (i)0 (f)-254 (n)0 (o)-253 (thread)-254 (blocks)-253 (inde\336nitely)-254 (while)-254 (holding)-254 (the)-254 (meta-lock,)-254 (all)-253 (threads)-254 (attempting)]TJ +T* +[(to acquire the meta-lock will e)26 (v)15 (entually succeed.)]TJ +0 -1.75 TD +[(Armed)-300 (with)-299 (this)-299 (meta-lock,)-299 (we)-299 (no)25 (w)-300 (proceed)-299 (to)-299 (implement)-298 (the)-300 (monitor)-299 (operations:)-299 (lock,)-300 (unlock,)]TJ +0 -1.1667 TD +[(w)10 (ait,)-445 (and)-444 (notify)65 (.)-444 (Because)-444 (the)-445 (meta-lock)-444 (arbitrates)-445 (access)-444 (among)-445 (contending)-444 (threads,)-444 (we)-444 (can)]TJ +T* +[(implement)-323 (monitor)-323 (operations)-323 (using)-324 (a)-324 (number)-323 (of)-323 (dif)25 (ferent)-324 (data)-323 
(structures)-323 (and)-323 (of)25 (fer)-323 (a)-323 (v)25 (ariety)-323 (of)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 14 14 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(11)Tj +-19 29.6525 TD +[(semantics)-329 ([7].)-329 (W)80 (e)-329 (ha)20 (v)15 (e)-329 (chosen)-329 (an)-329 (implementation)-328 (that)-329 (uses)-329 (a)-329 (simple)-329 (link)9 (ed)-328 (list)-329 (of)-329 (lock)-329 (records)]TJ +0 -1.1667 TD +0 Tw +[(and gi)25 (v)15 (es equal preference to a)16 (w)10 (ak)10 (ened w)10 (aiters and ne)26 (wly arri)25 (v)15 (ed threads.)]TJ +/N91 1 Tf +14 0 0 14 72 343.67 Tm +(4.3 Locking and unlocking objects)Tj +/N92 1 Tf +12 0 0 12 72 319 Tm +[(Acquiring)-379 (and)-379 (releasing)-379 (the)-380 (monitor)21 (-lock)-379 (of)-379 (an)-379 (object)-378 (corresponds)-379 (to)-379 (entering)-379 (and)-379 (e)15 (xiting)-379 (the)]TJ +T* +[(object\325)55 (s)-359 (monitor)55 (.)-359 (Figure)-250 (7)-360 (sho)25 (ws)-359 (the)-360 (f)10 (ast-path)-359 (implementation)-359 (for)-360 (these)-359 (operations.)-359 (Most)-360 (com-)]TJ +T* +[(monly)65 (,)-279 (the)-279 (object)-279 (being)-279 (monitor)21 (-lock)10 (ed)-279 (is)-279 (either)-279 (unlock)10 (ed\321in)-279 (a)]TJ +/N95 1 Tf +26.0975 0 TD +(NEUTRAL)Tj +/N92 1 Tf +4.4783 0 TD +(or)Tj +/N95 1 Tf +1.1125 0 TD +(WAITERS)Tj +/N92 1 Tf +4.4783 0 TD +(state\321)Tj +-36.1667 -1.1667 TD +[(or)-349 (being)-350 (lock)10 (ed)-350 (recursi)25 (v)15 (ely)65 (.)-350 (I)0 (n)-350 (these)-350 (cases,)-350 (the)-350 (locking)-350 (thread)-350 (simply)-349 (updates)-350 (the)-350 (object\325)55 (s)-350 (syn-)]TJ +T* +[(chronization)-360 (data;)-360 (it)-360 (need)-359 (not)-360 (interact)-360 (with)-359 (other)-360 (threads.)-360 (Lik)10 (e)25 (wise,)-359 (when)-360 
(unlocking)-360 (an)-360 (object,)]TJ +T* +[(there)-269 (are)-269 (tw)10 (o)-269 (cases)-269 (that)-269 (in)40 (v)20 (olv)16 (e)-269 (n)0 (o)-269 (interaction)-269 (with)-269 (other)-269 (threads:)-270 (the)-269 (unlocking)-269 (thread)-269 (has)-269 (recur-)]TJ +T* +[(si)25 (v)15 (ely)-280 (lock)10 (ed)-279 (the)-280 (object,)-279 (in)-279 (which)-279 (case)-279 (it)-280 (simply)-279 (decrements)-279 (the)-280 (lock)-280 (count;)-279 (or)-279 (there)-279 (are)-280 (no)-279 (other)]TJ +T* +[(threads)-272 (attempting)-272 (to)-273 (acquire)-272 (a)-272 (singly-held)-272 (lock,)-272 (in)-272 (which)-273 (case)-272 (it)-272 (restores)-271 (the)-272 (displaced)-272 (multi-use)]TJ +T* +[(w)10 (ord v)25 (alue, which has the)]TJ +/N95 1 Tf +10.685 0 TD +(NEUTRAL)Tj +/N92 1 Tf +4.2 0 TD +( lock state.)Tj +-14.885 -1.75 TD +[(The)-332 (remaining)-332 (cases)-332 (in)40 (v)20 (olv)16 (e)-333 (threads)-333 (contending)-332 (for)-332 (the)-332 (monitor)20 (-lock;)-332 (see)-333 (Figure)-250 (8.)-332 (Much)-333 (as)-332 (in)]TJ +0 -1.1667 TD +[(meta-lock)-244 (hand-of)25 (f,)-244 (we)-245 (use)-244 (a)-244 (per)20 (-thread)-245 (mute)15 (x)-244 (and)-244 (condition)-244 (v)25 (ariable)-245 (to)-244 (coordinate)-244 (acquiring)-244 (and)]TJ +T* +[(releasing)-356 (threads.)-355 (When)-355 (a)-355 (thread)-356 (attempts)-355 (to)-356 (acquire)-355 (a)-355 (monitor)20 (-lock)-355 (b)20 (u)0 (t)-356 (\336nds)-355 (it)-356 (lock)10 (ed,)-355 (it)-356 (sus-)]TJ +T* +[(pends)-232 (on)-232 (a)-232 (condition)-233 (v)25 (ariable)-232 (in)-232 (its)-232 (o)25 (w)0 (n)]TJ +/N95 1 Tf +16.0292 0 TD +(EE)Tj +/N92 1 Tf +1.2 0 TD +[(,)-232 (w)10 (aiting)-232 (to)-232 (be)-233 (signalled)-232 (by)-232 (a)-233 (lock-releasing)-232 (thread)-232 (that)]TJ +-17.2292 -1.1667 TD +[(it)-285 (should)-285 (re-attempt)-285 (the)-285 (acquisition.)-285 (When)-284 (the)-285 (acquiring)-284 (thread)-285 (recei)25 (v)15 (e)0 (s)-284 (this)-285 (signal,)-285 (it)-285 (repeats)-285 (the)]TJ +T* 
+[(lock-acquisition)-250 (slo)25 (w)-250 (path:)-250 (it)-251 (acquires)-249 (the)-250 (object\325)55 (s)-250 (meta-lock)-250 (and)-250 (checks)-250 (the)-250 (object\325)55 (s)-250 (lock)-250 (state.)-250 (If)]TJ +T* +[(the)-384 (state)-384 (is)-385 (no)25 (w)-384 (dif)25 (ferent)-384 (from)]TJ +/N95 1 Tf +13.0842 0 TD +(LOCKED)Tj +/N92 1 Tf +3.6 0 TD +[(,)-384 (i)0 (t)-384 (adjusts)-384 (the)-384 (synchronization)-383 (data)-384 (to)-384 (indicate)-384 (that)-384 (it)]TJ +-16.6842 -1.1667 TD +[(holds)-267 (the)-267 (monitor)20 (-lock)-267 (and)-267 (releases)-267 (the)-267 (meta-lock;)-267 (if)-267 (the)-268 (state)-267 (is)]TJ +/N95 1 Tf +25.9725 0 TD +(LOCKED)Tj +/N92 1 Tf +3.6 0 TD +[(,)-268 (the)-267 (thread)-268 (releases)-267 (the)]TJ +-29.5725 -1.1667 TD +[(meta-lock and w)10 (aits again.)]TJ +/N95 1 Tf +8 0 0 8 309 705.67 Tm +(void releaseMetaLockSlow\(ExecEnv *ee,)Tj +15 -1.25 TD +[(BitField)-544 (releaseBits\))-544 ({)]TJ +-13.8 -1.25 TD +(/* We are in a race with our successor to)Tj +1.8 -1.25 TD +(lock ee->metaLockMutex; the winner of)Tj +T* +(the race waits for the loser. 
*/)Tj +-1.8 -1.25 TD +(mutexLock\(&ee->metaLockMutex\);)Tj +T* +(if \(ee->succEE\) {)Tj +1.2 -1.25 TD +(/* Lost the race: */)Tj +T* +(assert\(!ee->succEE->bitsForGrab\);)Tj +T* +(assert\(!ee->bitsForGrab\);)Tj +T* +(assert\(!ee->succEE->gotMetaLockSlow\);)Tj +T* +[(ee->succEE->metaLockBits)-2400 (= releaseBits;)]TJ +T* +(ee->succEE->gotMetaLockSlow = TRUE;)Tj +T* +(ee->succEE = NULL;)Tj +T* +(condvarSignal\(&ee->metaLockCondvar\);)Tj +-1.2 -1.25 TD +(} else {)Tj +1.2 -1.25 TD +(/* Won the race: */)Tj +T* +(ee->metaLockBits = releaseBits;)Tj +T* +[(ee->bitsForGrab)-1200 (= TRUE;)]TJ +T* +(do {)Tj +1.2 -1.25 TD +(condvarWait\(&ee->metaLockCondvar,)Tj +7.2 -1.25 TD +(&ee->metaLockMutex\);)Tj +-8.4 -1.25 TD +(} while \(ee->bitsForGrab\);)Tj +-1.2 -1.25 TD +(})Tj +T* +(mutexUnlock\(&ee->metaLockMutex\);)Tj +-1.2 -1.25 TD +(})Tj +-29.125 31.25 TD +(BitField getMetaLockSlow\(ExecEnv *ee,)Tj +15 -1.25 TD +(BitField predBits\) {)Tj +-13.8 -1.25 TD +(BitField bits;)Tj +T* +(ExecEnv *predEE = busyEE\(predBits\);)Tj +T* +(assert\(getLockState\(predBits\) == BUSY\);)Tj +T* +(mutexLock\(&predEE->metaLockMutex\);)Tj +T* +(if \(!predEE->bitsForGrab\) {)Tj +1.2 -1.25 TD +(/* Won the race: */)Tj +T* +(predEE->succEE = ee;)Tj +T* +(do {)Tj +1.2 -1.25 TD +(condvarWait\(&predEE->metaLockCondvar,)Tj +7.2 -1.25 TD +(&predEE->metaLockMutex\);)Tj +-8.4 -1.25 TD +(} while \(!ee->gotMetaLockSlow\);)Tj +T* +(ee->gotMetaLockSlow = FALSE;)Tj +T* +(bits = ee->metaLockBits;)Tj +-1.2 -1.25 TD +(} else {)Tj +1.2 -1.25 TD +(/* Lost the race: */)Tj +T* +(bits = predEE->metaLockBits;)Tj +T* +(predEE->bitsForGrab = FALSE;)Tj +T* +(condvarSignal\(&predEE->metaLockCondvar\);)Tj +-1.2 -1.25 TD +(})Tj +T* +(mutexUnlock\(&predEE->metaLockMutex\);)Tj +T* +(return bits;)Tj +-1.2 -1.25 TD +(})Tj +/N92 1 Tf +12 0 0 12 196 433.67 Tm +[(Figure 6. 
Slo)25 (w paths for meta-lock operations)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 15 15 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(12)Tj +-19 12.5967 TD +[(T)80 (o)-248 (release)-247 (a)-248 (contended)-248 (monitor)21 (-lock,)-247 (a)-248 (thread)-248 (\336rst)-247 (obtains)-248 (the)-248 (meta-lock.)-247 (Then)-248 (it)-247 (remo)15 (v)15 (e)0 (s)-248 (its)-248 (o)25 (w)0 (n)]TJ +0 -1.1667 TD +[(lock)-341 (record)-342 (from)-342 (the)-341 (queue.)-342 (Subsequently)66 (,)-342 (i)0 (t)-341 (calls)]TJ +/N95 1 Tf +20.7758 0 TD +(wakeupEE\(\))Tj +/N92 1 Tf +6.3417 0 TD +[(to)-341 (\336nd)-341 (the)-342 (\336rst)-341 (thread)-342 (on)-342 (the)]TJ +-27.1175 -1.1667 TD +[(lock)-308 (queue)-308 (that)-307 (is)-308 (w)10 (aiting)-308 (to)-309 (acquire)-307 (the)-308 (lock.)-308 (If)-308 (there)-308 (is)-308 (such)-308 (a)-308 (thread,)-308 (it)-308 (is)-308 (signalled.)-307 (Then,)-308 (the)]TJ +T* +[(releasing)-330 (thread)-330 (performs)-331 (a)-330 (meta-lock)-330 (release)-331 (to)-329 (write)-331 (out)-330 (the)-330 (shortened)-330 (lock)-330 (queue)-330 (and)-330 (set)-331 (the)]TJ +T* +[(lock)-280 (state)-280 (to)]TJ +/N95 1 Tf +5.175 0 TD +(WAITERS)Tj +/N92 1 Tf +4.2 0 TD +[(.)-280 (\()0 (W)80 (e)-280 (ha)20 (v)15 (e)-280 (elided)-281 (code)-280 (that)-281 (optimizes)-280 (a)15 (w)10 (ay)-280 (redundant)-281 (signalling)-280 (on)-281 (w)10 (ait-)]TJ +-9.375 -1.1667 TD +[(ing)-245 (threads.\))-245 (Thus,)-245 (at)-245 (the)-245 (monitor)20 (-lock)-245 (le)25 (v)15 (el,)-245 (unlik)10 (e)-245 (the)-245 (meta-lock)-245 (le)25 (v)15 (el,)-245 (we)-245 (do)-245 (not)-245 (use)-244 (a)-245 (hand-of)25 (f:)]TJ +T* +[(the)-238 (releasing)-238 (thread)-238 (does)-238 (not)-239 (gi)26 (v)14 (e)-238 (the)-238 
(monitor)20 (-lock)-238 (to)-238 (a)-238 (w)10 (aiting)-239 (thread)-238 (b)20 (u)0 (t)-238 (merely)-238 (in)40 (vites)-238 (the)-238 (w)10 (ait-)]TJ +T* +0 Tw +(ing thread to re-attempt the acquisition.)Tj +/N95 1 Tf +7 0 0 7 80.71 710.55 Tm +(bool_t monitorEnter\(ExecEnv *ee, Object *obj\) {)Tj +1.2 -1.1429 TD +[(BitField)-1201 (r)-3000 (=)0 ( getMetaLock\(ee, obj\);)]TJ +T* +(LockState state = lockState\(r\);)Tj +T* +(if \(state == NEUTRAL\) {)Tj +1.2 -1.1429 TD +(/* Establish locking by this thread. */)Tj +T* +(LockRecord *lr = allocLockRecord\(ee\);)Tj +T* +(lr->storedBits = r;)Tj +T* +(releaseMetaLock\(ee, obj, lr | LOCKED\);)Tj +-1.2 -1.1428 TD +(} else if \(state == LOCKED\) {)Tj +1.2 -1.1429 TD +(LockRecord *ownerLR = lockRecord\(r\);)Tj +T* +(if \(ownerLR->owner == ee\) {)Tj +1.2014 -1.1429 TD +(/* Recursive locking. */)Tj +T* +(ownerLR->lockCount++;)Tj +T* +(releaseMetaLock\(ee, obj, r\);)Tj +-1.2014 -1.1429 TD +(} else {)Tj +1.2014 -1.1429 TD +(LockRecord *lr = allocLockRecord\(ee\);)Tj +T* +(ownerLR->queue = appendToQueue\(ownerLR->queue,)Tj +18.6 -1.1428 TD +(lr\);)Tj +-18.6 -1.1429 TD +(monitorEnterSlow\(ee, obj, r\);)Tj +-1.2014 -1.1429 TD +(})Tj +-1.2 -1.1429 TD +(} else if \(state == WAITERS\) {)Tj +1.2 -1.1429 TD +(/* obj is unlocked but has threads waiting)Tj +1.8 -1.1429 TD +(for notification. 
*/)Tj +-1.8 -1.1429 TD +(LockRecord *lr = allocLockRecord\(ee\);)Tj +T* +(LockRecord *firstWaiterLR = lockRecord\(r\);)Tj +T* +[(lr->queue)-3600 (= firstWaiterLR;)]TJ +T* +(lr->storedBits = firstWaiterLR->storedBits;)Tj +T* +(releaseMetaLock\(ee, obj, lr | LOCKED\);)Tj +-1.2 -1.1429 TD +(})Tj +T* +(return TRUE;)Tj +-1.2 -1.1429 TD +(})Tj +33.47 34.3871 TD +(bool_t monitorExit\(ExecEnv *ee, Object *obj\) {)Tj +1.2 -1.1429 TD +[(BitField)-2400 (r)-4200 (=)0 ( getMetaLock\(ee, obj\);)]TJ +T* +(LockRecord *ownerLR = lockRecord\(r\);)Tj +T* +[(LockState)-1800 (state)-1800 (= lockState\(r\);)]TJ +T* +(if \(state == LOCKED && ownerLR->owner == ee\) {)Tj +1.2 -1.1429 TD +(assert\(ownerLR->lockCount >= 1\);)Tj +T* +(if \(ownerLR->lockCount == 1\) {)Tj +1.2 -1.1429 TD +(/* Last release: will not have lock)Tj +1.8 -1.1429 TD +(after this operation. */)Tj +-1.8 -1.1429 TD +(if \(ownerLR->queue == NULL\) {)Tj +1.2 -1.1429 TD +(/* No-one waiting. */)Tj +T* +(assert\(lockState\(ownerLR->storedBits\))Tj +4.8 -1.1429 TD +(== NEUTRAL\);)Tj +-4.8 -1.1429 TD +(releaseMetaLock\(ee, obj,)Tj +9.6 -1.1429 TD +(ownerLR->storedBits\);)Tj +-10.8 -1.1429 TD +(} else {)Tj +1.2 -1.1429 TD +(/* There is a queue. Release)Tj +1.8 -1.1429 TD +(with wakeup call. */)Tj +-1.8 -1.1429 TD +(ownerLR->queue->storedBits =)Tj +12 -1.1429 TD +(ownerLR->storedBits;)Tj +-12 -1.1429 TD +(monitorExitSlow\(ee, obj, ownerLR->queue\);)Tj +T* +(ownerLR->queue = NULL;)Tj +-1.2 -1.1429 TD +(})Tj +T* +(recycleLockRecord\(ee, ownerLR\);)Tj +-1.2 -1.1429 TD +(} else {)Tj +1.2 -1.1429 TD +(/* Still has lock after this. */)Tj +T* +(ownerLR->lockCount--;)Tj +T* +(releaseMetaLock\(ee, obj, r\);)Tj +-1.2 -1.1429 TD +(})Tj +-1.2 -1.1429 TD +(} else {)Tj +1.2 -1.1429 TD +(releaseMetaLock\(ee, obj, r\);)Tj +T* +(throwIllegalMonitorStateException\(\);)Tj +T* +(return FALSE;)Tj +-1.2 -1.1429 TD +(})Tj +T* +(return TRUE;)Tj +-1.2 -1.1429 TD +(})Tj +/N92 1 Tf +12 0 0 12 191.05 413.5 Tm +[(Figure 7. 
F)15 (ast paths for monitor)20 (-lock operations)]TJ +/N95 1 Tf +7 0 0 7 81 368.43 Tm +(void monitorEnterSlow\(ExecEnv *ee, Object *obj,)Tj +13.2 -1.1429 TD +(BitField r\) {)Tj +-12 -1.1429 TD +(LockRecord *lr;)Tj +T* +(while \(lockState\(r\) == LOCKED\) {)Tj +1.2 -1.1429 TD +(mutexLock\(&ee->monitorLockMutex\);)Tj +T* +(releaseMetaLock\(ee, obj, r\);)Tj +T* +(condvarWait\(&ee->monitorLockCondvar,)Tj +7.2 -1.1429 TD +(&ee->monitorLockMutex\);)Tj +-7.2 -1.1429 TD +(mutexUnlock\(&ee->monitorLockMutex\);)Tj +T* +(r = getMetaLock\(ee, obj\);)Tj +-1.2 -1.1429 TD +(})Tj +T* +(assert\(lockState\(r\) == WAITERS\);)Tj +T* +(lr = moveMyLRToFront\(ee, lockRecord\(r\)\);)Tj +T* +(releaseMetaLock\(ee, obj, lr | LOCKED\);)Tj +-1.2 -1.1429 TD +(})Tj +33.4286 15.9386 TD +(void monitorExitSlow\(ExecEnv *ee, Object *obj,)Tj +12.6 -1.1429 TD +(LockRecord *lr\) {)Tj +-11.4 -1.1429 TD +(ExecEnv *wakeEE = wakeupEE\(lr\);)Tj +T* +(if \(wakeEE\) {)Tj +1.2 -1.1429 TD +(mutexLock\(&wakeEE->monitorLockMutex\);)Tj +T* +(releaseMetaLock\(ee, obj, lr | WAITERS\);)Tj +T* +(condvarSignal\(&wakeEE->monitorLockCondvar\);)Tj +T* +(mutexUnlock\(&wakeEE->monitorLockMutex\);)Tj +-1.2 -1.1429 TD +(} else {)Tj +1.2 -1.1429 TD +(releaseMetaLock\(ee, obj, lr | WAITERS\);)Tj +-1.2 -1.1429 TD +(})Tj +-1.2 -1.1429 TD +(})Tj +/N92 1 Tf +12 0 0 12 188.78 239.67 Tm +[(Figure 8. 
Slo)25 (w paths for monitor)21 (-lock operations)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 16 16 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(13)Tj +/N91 1 Tf +14 0 0 14 72 406.17 Tm +0 Tw +[(4.4 W)65 (aiting and notifying)]TJ +/N92 1 Tf +12 0 0 12 72 381.5 Tm +[(Figure)-250 (9)-248 (sho)25 (ws)-247 (the)-247 (remaining)-247 (tw)10 (o)-247 (monitor)-248 (operations:)-247 (w)9 (ait)-247 (and)-248 (notify)65 (.)-248 (The)-247 (Ja)20 (v)25 (a)-247 (language)-247 (speci\336-)]TJ +0 -1.1667 TD +[(cation)-230 (requires)-230 (that)-231 (the)-230 (thread)-230 (performing)-231 (them)-230 (must)-231 (hold)-230 (the)-230 (object\325)55 (s)-230 (monitor)20 (-lock,)-230 (otherwise)-231 (the)]TJ +T* +[(operations)-292 (thro)25 (w)-292 (a)0 (n)-293 (e)15 (xception.)-292 (A)-292 (thread)-293 (w)10 (aits)-293 (by)-292 (acquiring)-292 (the)-292 (meta-lock,)-292 (setting)-292 (the)]TJ +/N95 1 Tf +34.8 0 TD +(isWait-)Tj +-34.8 -1.1667 TD +(ingForNotify)Tj +/N92 1 Tf +7.5242 0 TD +[(\336eld)-324 (in)-324 (its)-323 (EE,)-325 (and)-323 (releasing)-324 (the)-324 (monitor)20 (-lock)-323 (and)-324 (meta-lock)-324 (\(i.e.,)-323 (setting)-325 (the)]TJ +-7.5242 -1.1667 TD +[(lock)-269 (state)-268 (to)]TJ +/N95 1 Tf +5.1383 0 TD +(WAITERS)Tj +/N92 1 Tf +4.2 0 TD +[(\).)-269 (It)-268 (then)-268 (w)10 (aits)-268 (until)-268 (a)-269 (noti\336cation)-268 (operation)-268 (mak)10 (es)-269 (it)-268 (a)-269 (potential)-268 (lock)-268 (con-)]TJ +-9.3383 -1.1667 TD +[(tender)-239 (again,)-239 (and)-239 (some)-239 (monitor)21 (-lock)-239 (release)-239 (operation)-239 (signals)-239 (it)-240 (to)-239 (acti)25 (v)15 (ely)-239 (contend,)-239 (or)-239 (until)-239 (some)]TJ +T* +[(amount)-285 (of)-284 (time)-284 (speci\336ed)-285 (in)-284 (the)-285 
(w)10 (ait)-285 (operation)-284 (has)-284 (elapsed.)-285 (A)-285 (notifying)-284 (thread)-285 (similarly)-285 (acquires)]TJ +T* +[(the)-309 (meta-lock.)-309 (Then)-309 (it)-310 (w)10 (alks)-309 (the)-310 (queue)-309 (of)-309 (lock)-309 (records,)-309 (looking)-309 (for)-310 (threads)-309 (w)9 (aiting)-309 (for)-309 (noti\336ca-)]TJ +T* +[(tion\321ones)-324 (whose)]TJ +/N95 1 Tf +7.5933 0 TD +(isWaitingForNotify)Tj +/N92 1 Tf +11.1242 0 TD +[(\336eld)-324 (is)]TJ +/N95 1 Tf +3.0942 0 TD +(TRUE)Tj +/N92 1 Tf +2.4 0 TD +[(\321and)-324 (resetting)-324 (this)-325 (boolean)-324 (to)-324 (indi-)]TJ +-24.2117 -1.1667 TD +[(cate)-282 (that)-282 (the)15 (y)-282 (h)0 (a)20 (v)15 (e)-282 (been)-282 (noti\336ed.)-281 (The)]TJ +/N95 1 Tf +15.3917 0 TD +(notify\(\))Tj +/N92 1 Tf +5.0825 0 TD +[(operation)-281 (\336nds)-282 (the)-281 (\336rst)-282 (such)-282 (thread)-282 (and)-282 (resets)]TJ +-20.4742 -1.1667 TD +[(its)-255 (boolean;)]TJ +/N95 1 Tf +4.8992 0 TD +(notifyAll\(\))Tj +/N92 1 Tf +6.855 0 TD +[(tra)20 (v)15 (erses)-255 (the)-255 (entire)-255 (lock)-255 (queue.)-255 (Finally)65 (,)-255 (the)-255 (notifying)-254 (thread)-255 (releases)]TJ +-11.7542 -1.1667 TD +(the meta-lock.)Tj +0 -1.75 TD +[(Since)-405 (some)-406 (styles)-405 (of)-405 (concurrent)-405 (programming)-405 (result)-405 (in)-406 (a)-405 (high)-405 (frequenc)16 (y)-406 (o)0 (f)-405 (noti\336cations,)-405 (our)]TJ +0 -1.1667 TD +[(implementation)-331 (has)-331 (further)-331 (optimized)-331 (the)-330 (notify)-332 (code)-331 (\(the)-331 (optimization)-331 (is)-331 (not)-331 (sho)25 (wn)-331 (in)-331 (the)-331 (\336g-)]TJ +T* +[(ure\).)-242 (The)-242 (idea)-243 (is)-242 (that)-242 (a)-243 (simple)-242 (read)-242 (of)-242 (the)-243 (multi-use)-242 (w)10 (ord)-243 (most)-242 (of)-243 (the)-242 (time)-242 (suf)25 (\336ces)-242 (to)-242 (grab)-243 (the)-242 (root)]TJ +T* +[(of)-253 (the)-252 (lock)-252 (queue.)-253 (If)-252 (the)-253 (read)-252 (fetches)-252 (a)-253 (w)10 (ord)-253 (in)]TJ +/N95 1 Tf +18.9008 0 TD 
+(LOCKED)Tj +/N92 1 Tf +3.8525 0 TD +[(state,)-252 (the)-253 (notifying)-252 (thread)-253 (can)-252 (v)15 (erify)-253 (that)]TJ +-22.7533 -1.1667 TD +[(it)-314 (holds)-314 (the)-314 (monitor)20 (-lock)-314 (and)-313 (w)10 (alk)-314 (the)-314 (queue)-314 (without)-314 (holding)-314 (the)-315 (meta-lock.)-313 (The)-314 (correctness)-313 (of)]TJ +T* +[(this)-292 (optimization)-291 (relies)-291 (on)-292 (tw)10 (o)-292 (properties:)-291 (a)-291 (n)0 (e)25 (w)-291 (thread)-291 (w)9 (aiting)-291 (for)-291 (a)-291 (notify)-292 (cannot)-291 (appear)-291 (in)-292 (the)]TJ +T* +[(queue)-249 (\(because)-250 (the)-250 (notifying)-250 (thread)-249 (holds)-250 (the)-250 (monitor)21 (-lock\),)-250 (and)-250 (other)-250 (threads)-249 (that)-250 (join)-250 (the)-250 (queue)]TJ +T* +[(do)-391 (so)-391 (at)-390 (the)-390 (end)-391 (\(so)-390 (the)-391 (queue)-390 (is)-390 (ne)25 (v)15 (e)0 (r)-391 (disconnected\).)-390 (See)-391 (also)-391 (Section)-391 (6.3)-391 (for)-391 (an)-391 (alternati)26 (v)15 (e)]TJ +T* +[(implementation in which notify does no queue w)11 (alking.)]TJ +13.4017 31.5417 TD +[(Figure 9. 
W)80 (ait and notify code)]TJ +/N95 1 Tf +7 0 0 7 77.5 711.33 Tm +(void monitorWait\(ExecEnv *ee, Object *obj,)Tj +10.2 -1.1429 TD +(java_long millis\) {)Tj +-9 -1.1429 TD +[(BitField)-2400 (r)-4200 (=)0 ( getMetaLock\(ee, obj\);)]TJ +T* +(LockRecord *ownerLR = lockRecord\(r\);)Tj +T* +[(LockState)-1800 (state)-1800 (= lockState\(r\);)]TJ +T* +(if \(state == LOCKED && ownerLR->owner == ee\) {)Tj +1.2 -1.1428 TD +(mutexLock\(&ee->monitorLockMutex\);)Tj +T* +(ee->isWaitingForNotify = TRUE;)Tj +T* +(monitorExitSlow\(ee, obj, ownerLR\);)Tj +T* +(if \(millis == TIMEOUT_INFINITY\))Tj +1.2 -1.1429 TD +(condvarWait\(&ee->monitorLockCondvar,)Tj +7.2 -1.1429 TD +(&ee->monitorLockMutex\);)Tj +-8.4 -1.1429 TD +(else)Tj +1.2 -1.1428 TD +(condvarTimedWait\(&ee->monitorLockCondvar,)Tj +7.2 -1.1429 TD +(&ee->monitorLockMutex, millis\);)Tj +-8.4 -1.1429 TD +(ee->isWaitingForNotify = FALSE;)Tj +T* +(mutexUnlock\(&ee->monitorLockMutex\);)Tj +T* +(r = getMetaLock\(ee, obj\);)Tj +T* +(monitorEnterSlow\(ee, obj, r\);)Tj +-1.2 -1.1429 TD +(} else {)Tj +1.2 -1.1429 TD +(releaseMetaLock\(ee, obj, r\);)Tj +T* +(throwIllegalMonitorStateException\(\);)Tj +-1.2 -1.1428 TD +(})Tj +-1.2 -1.1429 TD +(})Tj +33.9086 26.2857 TD +(void notifyOneOrAll\(ExecEnv *ee, Object *obj,)Tj +12 -1.1429 TD +(bool_t one\) {)Tj +-10.8 -1.1429 TD +[(BitField)-2400 (r)-4200 (=)0 ( getMetaLock\(ee, obj\);)]TJ +T* +(LockRecord *ownerLR = lockRecord\(r\);)Tj +T* +[(LockState)-1800 (state)-1800 (= lockState\(r\);)]TJ +T* +(if \(state == LOCKED && ownerLR->owner == ee\) {)Tj +1.2 -1.1428 TD +(LockRecord *q = ownerLR->queue;)Tj +T* +(while \(q\) {)Tj +1.2 -1.1429 TD +(if \(q->owner->isWaitingForNotify\) {)Tj +1.2 -1.1429 TD +(q->owner->isWaitingForNotify = FALSE;)Tj +T* +(if \(one\) break;)Tj +-1.2 -1.1429 TD +(})Tj +T* +(q = q->queue;)Tj +-1.2 -1.1428 TD +(})Tj +T* +(releaseMetaLock\(ee, obj, r\);)Tj +-1.2 -1.1429 TD +(} else {)Tj +1.2 -1.1428 TD +(releaseMetaLock\(ee, obj, r\);)Tj +T* 
+(throwIllegalMonitorStateException\(\);)Tj +-1.2 -1.1429 TD +(})Tj +-1.2 -1.1429 TD +(})Tj +0 -2.2857 TD +(void monitorNotify\(ExecEnv *ee, Object *obj\) {)Tj +1.2 -1.1428 TD +(notifyOneOrAll\(ee, obj, TRUE\);)Tj +-1.2 -1.1429 TD +(})Tj +0 -2.2857 TD +(void monitorNotifyAll\(ExecEnv *ee, Object *obj\) {)Tj +1.2 -1.1429 TD +(notifyOneOrAll\(ee, obj, FALSE\);)Tj +-1.2 -1.1429 TD +(})Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 17 17 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(14)Tj +/N91 1 Tf +16 0 0 16 72 709.33 Tm +0 Tw +[(5 Corr)18 (ectness)]TJ +/N92 1 Tf +12 0 0 12 72 686 Tm +[(Belo)25 (w)65 (,)-250 (we)-251 (will)-250 (informally)-249 (ar)18 (gue)-250 (the)-250 (correctness)-250 (of)-250 (the)-250 (meta-lock)-250 (protocol)-250 (by)-250 (sho)25 (wing,)-250 (operation-)]TJ +0 -1.1667 TD +[(ally)65 (,)-247 (that)-246 (it)-246 (guarantees)-246 (mutual)-247 (e)16 (xclusion)-247 (and)-246 (freedom)-247 (from)-246 (lock)10 (out.)-247 (W)40 (ithout)-246 (loss)-246 (of)-246 (generality)65 (,)-247 (w)0 (e)]TJ +T* +[(can)-276 (focus)-276 (our)-277 (attention)-276 (on)-277 (a)-276 (single)-277 (object)-276 (that)-277 (is)-276 (subject)-276 (to)-277 (locking)-276 (by)-277 (the)-276 (meta-lock)-277 (protocol.)-276 (\(A)]TJ +T* +[(formal)-408 (proof)-407 (that)-407 (the)-408 (meta-lock)-407 (guarantees)-408 (mutual)-408 (e)16 (xclusion)-407 (and)-408 (freedom)-408 (from)-408 (lock)10 (out)-408 (uses,)]TJ +T* +[(respecti)25 (v)15 (ely)65 (,)-378 (Lamport\325)55 (s)-379 (method)-378 (of)-379 (inducti)26 (v)15 (e)-379 (assertions)-378 (as)-379 (e)16 (x)14 (empli\336ed)-378 (in)-379 ([15])-378 (and)-378 (the)-379 (Owicki-)]TJ +T* +(Lamport technique of [23].\))Tj +/N91 1 Tf +14 0 0 14 72 583.67 Tm +(5.1 Mutual 
exclusion)Tj +/N92 1 Tf +12 0 0 12 72 559 Tm +[(Assume)-319 (that)-318 (a)-319 (thread)]TJ +/N95 1 Tf +8.94 0 TD +(T1)Tj +/N92 1 Tf +1.5183 0 TD +[(attempts)-318 (to)-319 (obtain)-319 (the)-319 (meta-lock)-318 (by)-318 (calling)]TJ +/N95 1 Tf +17.8408 0 TD +(getMetaLock\(\))Tj +/N92 1 Tf +7.8 0 TD +[(.)-319 (There)]TJ +-36.0992 -1.1667 TD +[(are)-291 (tw)9 (o)-292 (cases)-292 (to)-292 (consider)-292 (according)-292 (to)-292 (whether)]TJ +/N95 1 Tf +19.2642 0 TD +(T1)Tj +/N92 1 Tf +1.4925 0 TD +[(reads)-292 (a)-292 (non-)]TJ +/N95 1 Tf +4.9708 0 TD +(BUSY)Tj +/N92 1 Tf +2.6917 0 TD +[(or)-292 (a)]TJ +/N95 1 Tf +1.8608 0 TD +(BUSY)Tj +/N92 1 Tf +2.6925 0 TD +[(status)-292 (from)-292 (the)]TJ +-32.9725 -1.1667 TD +[(atomic sw)10 (ap in)]TJ +/N95 1 Tf +6.295 0 TD +(getMetaLock\(\))Tj +/N92 1 Tf +7.8 0 TD +(.)Tj +/N94 1 Tf +-14.095 -1.75 TD +[(Case)-295 (1:)]TJ +/N95 1 Tf +12 0 2.551 12 113.08 510 Tm +(T1)Tj +/N94 1 Tf +12 0 0 12 131.02 510 Tm +[(r)37 (eads)-294 (non-)]TJ +/N95 1 Tf +12 0 2.551 12 182.78 510 Tm +(BUSY)Tj +/N92 1 Tf +12 0 0 12 210.03 510 Tm +[(.)-295 (I)0 (n)-295 (this)-295 (case,)]TJ +/N95 1 Tf +5.68 0 TD +(T1)Tj +/N92 1 Tf +1.495 0 TD +[(has)-295 (the)-295 (meta-lock,)-295 (and)-295 (we)-296 (need)-295 (only)-294 (sho)25 (w)-296 (that)-295 (no)]TJ +-18.6775 -1.1667 TD +[(other)-369 (thread)]TJ +/N95 1 Tf +5.2925 0 TD +(T\325)Tj +/N92 1 Tf +1.5692 0 TD +[(can)-369 (no)25 (w)-370 (obtain)-369 (the)-369 (meta-lock)-369 (before)]TJ +/N95 1 Tf +15.5742 0 TD +(T1)Tj +/N92 1 Tf +1.5692 0 TD +[(has)-369 (released)-369 (the)-369 (meta-lock.)-369 (F)15 (o)0 (r)-369 (this,)]TJ +-24.005 -1.1667 TD +[(observ)15 (e)-269 (that,)-268 (follo)26 (wing)-268 (the)-268 (sw)9 (ap)-268 (by)]TJ +/N95 1 Tf +14.585 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +[(,)-268 (the)-268 (header)-268 (w)10 (ord)-269 (has)-268 (a)]TJ +/N95 1 Tf +9.5683 0 TD +(BUSY)Tj +/N92 1 Tf +2.6683 0 TD +[(status.)-268 (Thus,)-268 (an)15 (y)-268 (thread)]TJ +/N95 1 Tf +9.7783 0 TD +(T\325)Tj +/N92 1 Tf +-37.8 
-1.1667 TD +[(that)-247 (tries)-248 (to)-248 (obtain)-248 (the)-247 (meta-lock)-247 (before)]TJ +/N95 1 Tf +16.01 0 TD +(T1)Tj +/N92 1 Tf +1.4475 0 TD +[(has)-248 (released)-248 (the)-248 (meta-lock)-247 (will)-247 (read)-248 (a)]TJ +/N95 1 Tf +15.2858 0 TD +(BUSY)Tj +/N92 1 Tf +2.6475 0 TD +[(status.)-248 (In)]TJ +-35.3908 -1.1667 TD +[(particular)41 (,)-462 (the)-462 (\336rst)-461 (subsequent)-462 (thread)]TJ +/N95 1 Tf +16.0733 0 TD +(T2)Tj +/N92 1 Tf +1.6625 0 TD +[(to)-462 (attempt)-462 (to)-462 (obtain)-462 (the)-462 (meta-lock)-462 (will)-462 (read)-461 (<)]TJ +/N95 1 Tf +19.8142 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +(,)Tj +/N95 1 Tf +-38.75 -1.1667 TD +(BUSY)Tj +/N92 1 Tf +2.4 0 TD +[(>.)-338 (In)-338 (this)-338 (case,)]TJ +/N95 1 Tf +6.4142 0 TD +(T2)Tj +/N92 1 Tf +1.5383 0 TD +[(will)-337 (need)-338 (to)-338 (e)15 (x)15 (ecute)]TJ +/N95 1 Tf +8.5967 0 TD +(getMetaLockSlow\(\))Tj +/N92 1 Tf +10.5383 0 TD +[(to)-338 (obtain)-337 (the)-338 (meta-lock)]TJ +-29.4875 -1.1667 TD +(from)Tj +/N95 1 Tf +2.2392 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +[(,)-295 (its)-294 (predecessor)55 (.)-295 (W)80 (e)-295 (sho)26 (w)-295 (belo)25 (w)-295 (that)]TJ +/N95 1 Tf +15.4867 0 TD +(T2)Tj +/N92 1 Tf +1.495 0 TD +[(will)-295 (be)-295 (unable)-295 (to)-294 (obtain)-295 (the)-295 (meta-lock)-295 (at)-295 (least)]TJ +-20.4208 -1.1667 TD +(until)Tj +/N95 1 Tf +2.1992 0 TD +(T1)Tj +/N92 1 Tf +1.565 0 TD +[(e)15 (x)15 (ecutes)-365 (the)-365 (meta-lock)-365 (release)-365 (code.)-365 (Similarly)65 (,)-365 (a)0 (n)15 (y)-365 (subsequent)-365 (attempts)-365 (to)-365 (obtain)-366 (the)]TJ +-3.7642 -1.1667 TD +[(meta-lock)-402 (\(while)]TJ +/N95 1 Tf +7.3592 0 TD +(T1)Tj +/N92 1 Tf +1.6017 0 TD +[(is)-403 (in)-402 (possession)-403 (of)-402 (the)-402 (meta-lock\))-403 (will)-402 (stall)-402 (with)-403 (each)-402 (thread)-402 (w)9 (aiting)-402 (to)]TJ +-8.9608 -1.1667 TD +[(obtain)-425 (the)-425 (meta-lock)-425 (from)-426 (its)-425 (predecessor)-425 (in)-425 (the)-425 (sequence.)-425 (Thus,)]TJ 
+/N95 1 Tf +27.7483 0 TD +(T1)Tj +/N92 1 Tf +1.625 0 TD +[(is)-425 (guaranteed)-425 (e)16 (xclusi)25 (v)15 (e)]TJ +-29.3733 -1.1667 TD +(access.)Tj +/N94 1 Tf +0 -1.75 TD +[(Case)-283 (2:)]TJ +/N95 1 Tf +12 0 2.551 12 112.79 349 Tm +(T1)Tj +/N94 1 Tf +12 0 0 12 130.59 349 Tm +[(r)36 (eads)]TJ +/N95 1 Tf +12 0 2.551 12 160.21 349 Tm +(BUSY)Tj +/N92 1 Tf +12 0 0 12 187.46 349 Tm +[(.)-283 (Assume)-284 (that)]TJ +/N95 1 Tf +5.8217 0 TD +(T1)Tj +/N92 1 Tf +1.4833 0 TD +[(reads)-283 (<)]TJ +/N95 1 Tf +2.9575 0 TD +(T0)Tj +/N92 1 Tf +1.2 0 TD +(,)Tj +/N95 1 Tf +0.5333 0 TD +(BUSY)Tj +/N92 1 Tf +2.4 0 TD +[(>)-284 (from)-283 (the)-283 (header)-283 (w)10 (ord,)-283 (where)]TJ +/N95 1 Tf +12.8325 0 TD +(T0)Tj +/N92 1 Tf +1.4825 0 TD +(is)Tj +-38.3325 -1.1667 TD +[(the)-368 (thread)-368 (that)-368 (e)15 (x)15 (ecuted)-367 (the)-368 (atomic)-368 (sw)10 (ap)-368 (immediately)-368 (preceding)]TJ +/N95 1 Tf +26.9883 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +[(.)-368 (I)0 (n)-368 (this)-367 (e)25 (v)15 (ent,)]TJ +/N95 1 Tf +6.3758 0 TD +(T1)Tj +/N92 1 Tf +1.5675 0 TD +[(will)-368 (be)]TJ +-36.1317 -1.1667 TD +[(forced)-256 (to)-256 (e)15 (x)15 (ecute)-256 (the)-255 (meta-lock)-256 (hand-of)25 (f)-256 (protocol)-256 (with)]TJ +/N95 1 Tf +22.1542 0 TD +(T0)Tj +/N92 1 Tf +1.2 0 TD +[(,)-256 (its)-256 (predecessor)55 (.)-256 (This)-256 (will)-256 (happen)-255 (only)]TJ +-23.3542 -1.1667 TD +(after)Tj +/N95 1 Tf +2.0942 0 TD +(T0)Tj +/N92 1 Tf +1.4617 0 TD +[(has)-262 (itself)-262 (obtained)-262 (the)-262 (meta-lock)-262 (and)-262 (subsequently)-262 (attempts)-262 (to)-262 (release)-262 (the)-261 (meta-lock.)-262 (No)]TJ +-3.5558 -1.1667 TD +[(thread)-362 (in)-362 (the)-361 (sequence)-362 (may)-361 (obtain)-362 (the)-361 (meta-lock)-362 (before)-362 (its)-362 (predecessor)-361 (has)-362 (released)-361 (the)-362 (meta-)]TJ +T* +[(lock,)-301 (and)-301 (we)-301 (are)-301 (guaranteed)-301 (mutual)-301 (e)15 (xclusion,)-301 (pro)16 (vided)-301 (the)-300 (meta-lock)-301 (hand-of)25 (f)-301 (w)9 
(orks)-300 (correctly)65 (.)]TJ +T* +[(T)80 (o)-314 (complete)-314 (the)-313 (ar)18 (gument,)-314 (we)-314 (shall)-313 (sho)25 (w)-314 (that)-313 (the)-314 (meta-lock)-314 (hand-of)25 (f)-314 (protocol)-313 (does)-314 (not)-314 (allo)25 (w)-314 (a)]TJ +T* +[(successor)-279 (thread)-278 (to)-279 (obtain)-278 (the)-278 (meta-lock)-279 (before)-278 (its)-278 (predecessor)-278 (has)-279 (released)-278 (the)-279 (meta-lock.)-278 (Thus,)]TJ +T* +[(consider)-289 (threads)]TJ +/N95 1 Tf +6.855 0 TD +(T0)Tj +/N92 1 Tf +1.4892 0 TD +[(\(the)-290 (predecessor\))-289 (and)]TJ +/N95 1 Tf +8.9208 0 TD +(T1)Tj +/N92 1 Tf +1.49 0 TD +[(\(the)-289 (successor\))-290 (from)-289 (Case)-289 (2)-290 (abo)16 (v)15 (e\321that)-289 (is)-290 (to)-289 (say)65 (,)]TJ +/N95 1 Tf +-18.755 -1.1667 TD +(T0)Tj +/N92 1 Tf +1.4408 0 TD +[(immediately)-241 (preceded)]TJ +/N95 1 Tf +9.09 0 TD +(T1)Tj +/N92 1 Tf +1.4408 0 TD +[(in)-240 (the)-240 (atomic)-240 (sw)9 (ap)-240 (in)-241 (its)-240 (attempt)-241 (to)-240 (obtain)-241 (the)-240 (meta-lock.)]TJ +/N95 1 Tf +22.8858 0 TD +(T1)Tj +/N92 1 Tf +1.4408 0 TD +[(ha)21 (ving)]TJ +-36.2983 -1.1667 TD +[(read)-271 (<)]TJ +/N95 1 Tf +2.5558 0 TD +(T0)Tj +/N92 1 Tf +1.2 0 TD +(,)Tj +/N95 1 Tf +0.5208 0 TD +(BUSY)Tj +/N92 1 Tf +2.4 0 TD +[(>)-270 (i)0 (s)-270 (forced)-271 (do)25 (wn)-270 (the)]TJ +/N95 1 Tf +8.5567 0 TD +(getMetaLockSlow\(\))Tj +/N92 1 Tf +10.4708 0 TD +[(path.)-270 (The)-271 (\336rst)-271 (thing)-271 (that)]TJ +/N95 1 Tf +9.9925 0 TD +(T1)Tj +/N92 1 Tf +1.47 0 TD +(does)Tj +-37.1667 -1.1667 TD +[(is)-365 (obtain)-365 (a)-365 (lock)-365 (on)]TJ +/N95 1 Tf +8.1583 0 TD +(T0->metaLockMutex)Tj +/N92 1 Tf +10.2 0 TD +[(,)-366 (and)-365 (check)-365 (if)]TJ +/N95 1 Tf +6.0983 0 TD +(T0->bitsForGrab)Tj +/N92 1 Tf +9.365 0 TD +[(is)-365 (set.)-366 (In)-365 (the)]TJ +-33.8217 -1.1667 TD +[(case)-228 (where)]TJ +/N95 1 Tf +4.6208 0 TD +(T0)Tj +/N92 1 Tf +1.4283 0 TD +[(is)-228 (not)-229 (yet)-228 (ready)-228 (to)-229 (release)-228 (the)-228 
(meta-lock,)-228 (this)-228 (\336eld)-229 (will)-228 (be)]TJ +/N95 1 Tf +22.8758 0 TD +(FALSE)Tj +/N92 1 Tf +3 0 TD +[(,)-228 (and)]TJ +/N95 1 Tf +2.15 0 TD +(T1)Tj +/N92 1 Tf +1.4283 0 TD +[(will)-229 (w)10 (ait)]TJ +-35.5033 -1.1667 TD +[(to)-466 (be)-467 (signalled)-466 (on)]TJ +/N95 1 Tf +8.1992 0 TD +(T0->metaLockCondvar)Tj +/N92 1 Tf +11.8667 0 TD +(with)Tj +/N95 1 Tf +2.2442 0 TD +(T1->gotMetaLockSlow)Tj +/N92 1 Tf +11.8667 0 TD +[(set.)-466 (W)80 (e)-467 (are)]TJ +-34.1767 -1.1667 TD +[(ensured)-248 (that)]TJ +/N95 1 Tf +5.1067 0 TD +(T1)Tj +/N92 1 Tf +1.4483 0 TD +[(will)-248 (not)-248 (be)-248 (able)-248 (to)-248 (proceed)-248 (further)-248 (until)]TJ +/N95 1 Tf +15.9275 0 TD +(T0)Tj +/N92 1 Tf +1.4483 0 TD +[(is)-248 (ready)-248 (to)-249 (release)-248 (the)-248 (meta-lock,)-248 (thus)]TJ +-23.9308 -1.1667 TD +[(ensuring mutual e)16 (xclusion.)]TJ +/N91 1 Tf +14 0 0 14 72 106.67 Tm +[(5.2 Fr)18 (eedom fr)18 (om lock)15 (out)]TJ +/N92 1 Tf +12 0 0 12 72 82 Tm +[(T)80 (o)-329 (sho)25 (w)-329 (l)0 (i)25 (v)15 (eness,)-329 (we)-330 (assume)-328 (that)-329 (each)-329 (thread)-329 (that)-329 (obtains)-329 (the)-329 (meta-lock)-329 (e)26 (v)15 (entually)-330 (attempts)-328 (to)]TJ +T* +(release the meta-lock.)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 18 18 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(15)Tj +-19 56.4858 TD +[(Consider)-252 (\336rst)-252 (the)-252 (case)-252 (where)-253 (a)-252 (thread)]TJ +/N95 1 Tf +15.2625 0 TD +(T1)Tj +/N92 1 Tf +1.4525 0 TD +[(attempting)-252 (to)-253 (release)-252 (the)-252 (meta-lock)-252 (e)15 (x)15 (ecutes)-252 (the)]TJ +/N95 1 Tf +19.455 0 TD +(CAS)Tj +/N92 1 Tf +2.0517 0 TD +(in)Tj +/N95 1 Tf +-38.2217 -1.1667 TD +(releaseMetaLock\(\))Tj +/N92 1 Tf +10.4608 0 
TD +[(and)-262 (disco)15 (v)15 (ers)-261 (that)-262 (the)-261 (header)-261 (w)10 (ord)-261 (compares)-261 (with)-261 (what)-261 (it)-261 (had)-261 (sw)10 (apped)]TJ +-10.4608 -1.1667 TD +[(in)-247 (when)-246 (it)-246 (had)-247 (obtained)-247 (the)-246 (meta-lock,)-247 (i.e.)-246 (with)-247 (<)]TJ +/N95 1 Tf +19.6425 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +0 Tw +(,)Tj +/N95 1 Tf +0.4967 0 TD +(BUSY)Tj +/N92 1 Tf +2.4 0 TD +[(>.)-247 (This)-247 (means)-247 (that)-247 (no)-247 (other)-247 (thread)-246 (has)]TJ +-23.7392 -1.1667 TD +[(attempted)-276 (to)-275 (obtain)-275 (the)-275 (meta-lock)-276 (since)]TJ +/N95 1 Tf +16.1517 0 TD +(T1)Tj +/N92 1 Tf +1.475 0 TD +[(did)-276 (so.)-275 (The)]TJ +/N95 1 Tf +4.7992 0 TD +(CAS)Tj +/N92 1 Tf +2.075 0 TD +[(completes)-276 (and)-275 (a)-276 (non-)]TJ +/N95 1 Tf +8.6033 0 TD +(BUSY)Tj +/N92 1 Tf +2.675 0 TD +[(status)-276 (is)]TJ +-35.7792 -1.1667 TD +[(written into the header w)10 (ord, thus releasing the meta-lock.)]TJ +0 -1.75 TD +[(Consider)-265 (no)25 (w)-265 (the)-265 (case)-265 (where)-265 (another)-265 (thread)]TJ +/N95 1 Tf +18.0483 0 TD +(T2)Tj +/N92 1 Tf +1.465 0 TD +[(has)-265 (attempted)-265 (to)-265 (acquire)-265 (the)-265 (meta-lock.)-266 (W)81 (e)-265 (ha)20 (v)15 (e)]TJ +-19.5133 -1.1667 TD +[(already)-268 (ar)18 (gued)-267 (abo)15 (v)15 (e)-268 (\(Case)-267 (2)-268 (i)0 (n)-267 (Section)-267 (5.1)-267 (abo)15 (v)15 (e)0 (\))-268 (that)-267 (in)-267 (this)-267 (case)]TJ +/N95 1 Tf +27.4225 0 TD +(T2)Tj +/N92 1 Tf +1.4675 0 TD +[(is)-267 (forced)-268 (do)25 (wn)-268 (the)]TJ +/N95 1 Tf +7.71 0 TD +(get-)Tj +-36.6 -1.1667 TD +(MetaLockSlow\(\))Tj +/N92 1 Tf +8.7008 0 TD +[(path.)-300 (No)25 (w)64 (,)-301 (when)]TJ +/N95 1 Tf +7.1442 0 TD +(T1)Tj +/N92 1 Tf +1.5008 0 TD +[(attempts)-300 (to)-301 (release)-301 (the)-300 (meta-lock,)-300 (the)]TJ +/N95 1 Tf +15.44 0 TD +(CAS)Tj +/N92 1 Tf +2.1008 0 TD +[(w)10 (ould)-300 (f)10 (ail)]TJ +-34.8867 -1.1667 TD +[(because)-397 (the)-397 (header)-397 (w)9 (ord)-397 
(contents)-397 (w)9 (ould)-397 (be)-397 (dif)24 (ferent)-397 (from)-397 (<)]TJ +/N95 1 Tf +25.365 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +(,)Tj +/N95 1 Tf +0.6467 0 TD +(BUSY)Tj +/N92 1 Tf +2.4 0 TD +[(>)-398 (written)-397 (by)]TJ +/N95 1 Tf +5.5892 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +0.397 Tc +[(.A)397 (sa)]TJ +-36.4008 -1.1667 TD +0 Tc +(result,)Tj +/N95 1 Tf +2.8558 0 TD +(T1)Tj +/N92 1 Tf +1.5842 0 TD +[(w)10 (ould)-384 (no)25 (w)-384 (b)0 (e)-384 (forced)-384 (do)25 (wn)-384 (the)]TJ +/N95 1 Tf +13.4083 0 TD +(releaseMetaLockSlow\(\))Tj +/N92 1 Tf +12.9842 0 TD +[(path.)-385 (There)-384 (are)-384 (tw)10 (o)]TJ +-30.8325 -1.1667 TD +[(cases)-367 (to)-367 (consider)-367 (according)-367 (to)-367 (whether)]TJ +/N95 1 Tf +16.4208 0 TD +(T1)Tj +/N92 1 Tf +1.5675 0 TD +(or)Tj +/N95 1 Tf +1.2 0 TD +(T2)Tj +/N92 1 Tf +1.5675 0 TD +[(succeed)-367 (in)-367 (locking)]TJ +/N95 1 Tf +8.0442 0 TD +(T0->metaLockMutex)Tj +/N92 1 Tf +-28.8 -1.1667 TD +[(\336rst,)-277 (respecti)25 (v)14 (ely)65 (,)-278 (i)0 (n)]TJ +/N95 1 Tf +8.3942 0 TD +(releaseMetaLockSlow\(\))Tj +/N92 1 Tf +12.8775 0 TD +(and)Tj +/N95 1 Tf +1.7217 0 TD +(getMetaLockSlow\(\))Tj +/N92 1 Tf +10.2008 0 TD +[(.)-277 (But)-277 (\336rst)-278 (note)]TJ +-33.1942 -1.1667 TD +[(that)-474 (the)-475 (initial)-474 (conditions)-474 (ensure)-475 (that)]TJ +/N95 1 Tf +16.18 0 TD +(T1->bitsForGrab)Tj +/N92 1 Tf +9 0 TD +(,)Tj +/N95 1 Tf +0.7242 0 TD +(T2->gotMetaLockSlow)Tj +/N92 1 Tf +11.875 0 TD +(are)Tj +-37.7792 -1.1667 TD +(both)Tj +/N95 1 Tf +2.0283 0 TD +(FALSE)Tj +/N92 1 Tf +3 0 TD +( initially and that)Tj +/N95 1 Tf +7.0558 0 TD +(T1->succEE)Tj +/N92 1 Tf +6 0 TD +( is)Tj +/N95 1 Tf +1.1667 0 TD +(NULL)Tj +/N92 1 Tf +2.4 0 TD +(.)Tj +/N94 1 Tf +-21.6508 -1.75 TD +[(Case)-301 (1:)]TJ +/N95 1 Tf +12 0 2.551 12 113.21 502 Tm +(T1)Tj +/N94 1 Tf +12 0 0 12 131.22 502 Tm +[(loc)20 (ks)]TJ +/N95 1 Tf +12 0 2.551 12 159.24 502 Tm +(T1->metaLockMutex)Tj +/N94 1 Tf +12 0 0 12 285.25 502 Tm +[(\336r)10 (st)]TJ +/N92 1 Tf +1.5458 0 
TD +[(.)-301 (I)0 (n)-300 (this)-301 (case,)]TJ +/N95 1 Tf +5.7017 0 TD +(T1)Tj +/N92 1 Tf +1.5 0 TD +[(will)-301 (\336nd)-301 (that)]TJ +/N95 1 Tf +5.5142 0 TD +(T1->succEE)Tj +/N92 1 Tf +6.3008 0 TD +(is)Tj +/N95 1 Tf +-38.3333 -1.1667 TD +(NULL)Tj +/N92 1 Tf +2.4 0 TD +[(,)-374 (s)0 (o)-374 (i)0 (t)-374 (will)-374 (tak)10 (e)-374 (the)-375 (else)-374 (branch)-374 (in)]TJ +/N95 1 Tf +14.5508 0 TD +(releaseMetaLockSlow\(\))Tj +/N92 1 Tf +12.6 0 TD +[(,)-374 (write)-374 (the)]TJ +/N95 1 Tf +4.6492 0 TD +(release-)Tj +-34.2 -1.1667 TD +(Bits)Tj +/N92 1 Tf +2.61 0 TD +(into)Tj +/N95 1 Tf +1.7658 0 TD +(T1->metaLockBits)Tj +/N92 1 Tf +9.6 0 TD +[(,)-210 (set)]TJ +/N95 1 Tf +1.7808 0 TD +(T1->bitsForGrab)Tj +/N92 1 Tf +9.21 0 TD +(to)Tj +/N95 1 Tf +0.9875 0 TD +(TRUE)Tj +/N92 1 Tf +2.4 0 TD +[(,)-210 (and)-210 (w)10 (ait)-210 (on)]TJ +/N95 1 Tf +5.2458 0 TD +(T1->meta-)Tj +-33.6 -1.1667 TD +(LockCondvar)Tj +/N92 1 Tf +7.0217 0 TD +(\(releasing)Tj +/N95 1 Tf +4.3642 0 TD +(T1->metaLockMutex)Tj +/N92 1 Tf +10.2 0 TD +[(\))-421 (for)]TJ +/N95 1 Tf +2.3408 0 TD +(T1->bitsForGrab)Tj +/N92 1 Tf +9.4217 0 TD +[(to)-421 (be)-421 (reset)-421 (to)]TJ +/N95 1 Tf +-33.3483 -1.1667 TD +(FALSE)Tj +/N92 1 Tf +3 0 TD +[(.)-295 (Subsequently)65 (,)]TJ +/N95 1 Tf +6.4133 0 TD +(T2)Tj +/N92 1 Tf +1.495 0 TD +[(will)-295 (succeed)-295 (in)-294 (locking)]TJ +/N95 1 Tf +9.6783 0 TD +(T1->metaLockMutex)Tj +/N92 1 Tf +10.2 0 TD +[(,)-295 (and)-295 (will)-295 (tak)10 (e)-295 (the)-295 (if)]TJ +-30.7867 -1.1667 TD +[(branch)-692 (in)]TJ +/N95 1 Tf +4.8833 0 TD +(getMetaLockSlow\(\))Tj +/N92 1 Tf +10.8925 0 TD +(since)Tj +/N95 1 Tf +2.7467 0 TD +(T1->bitsForGrab)Tj +/N92 1 Tf +9.6925 0 TD +[(is)-692 (set;)-692 (it)-692 (will)-692 (cop)10 (y)-693 (the)]TJ +/N95 1 Tf +-28.215 -1.1667 TD +(releaseBits)Tj +/N92 1 Tf +6.6 0 TD +[(,)-368 (reset)]TJ +/N95 1 Tf +2.875 0 TD +(T1->bitsForGrab)Tj +/N92 1 Tf +9 0 TD +[(,)-368 (signal)]TJ +/N95 1 Tf +3.3758 0 TD +(T1)Tj +/N92 1 Tf +1.5683 0 TD +[(to)-369 
(w)9 (a)0 (k)11 (e)-368 (up,)-369 (and)-368 (release)]TJ +/N95 1 Tf +10.1808 0 TD +(T1->meta-)Tj +-33.6 -1.1667 TD +(LockMutex)Tj +/N92 1 Tf +5.4 0 TD +[( allo)25 (wing)]TJ +/N95 1 Tf +3.975 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +( to continue. At this point)Tj +/N95 1 Tf +10.4733 0 TD +(T2)Tj +/N92 1 Tf +1.2 0 TD +[(, w)10 (ould ha)20 (v)15 (e)0 ( the meta-lock.)]TJ +/N94 1 Tf +-22.2483 -1.75 TD +[(Case)-282 (2:)]TJ +/N95 1 Tf +12 0 2.551 12 112.78 383 Tm +(T2)Tj +/N94 1 Tf +12 0 0 12 130.57 383 Tm +[(loc)20 (ks)]TJ +/N95 1 Tf +12 0 2.551 12 158.38 383 Tm +(T1->metaLockMutex)Tj +/N94 1 Tf +12 0 0 12 284.18 383 Tm +[(\336r)10 (st)]TJ +/N92 1 Tf +1.5458 0 TD +[(.)-282 (I)0 (n)-283 (this)-282 (case,)]TJ +/N95 1 Tf +5.63 0 TD +(T2)Tj +/N92 1 Tf +1.4825 0 TD +[(will)-282 (\336nd)-283 (that)]TJ +/N95 1 Tf +5.46 0 TD +(T1->bitsFor-)Tj +-31.8 -1.1667 TD +(Grab)Tj +/N92 1 Tf +2.6433 0 TD +(is)Tj +/N95 1 Tf +0.9108 0 TD +(FALSE)Tj +/N92 1 Tf +3 0 TD +[(,)-243 (s)0 (o)-244 (i)0 (t)-243 (will)-244 (tak)10 (e)-243 (the)-244 (else)-244 (branch)-243 (in)]TJ +/N95 1 Tf +13.3758 0 TD +(getMetaLockSlow\(\))Tj +/N92 1 Tf +10.2 0 TD +[(,)-243 (set)]TJ +/N95 1 Tf +1.8483 0 TD +(T1->succEE)Tj +/N92 1 Tf +6.2433 0 TD +(to)Tj +/N95 1 Tf +-38.2217 -1.1667 TD +(T2)Tj +/N92 1 Tf +1.2 0 TD +[(,)-194 (and)-193 (w)9 (ait)-194 (on)]TJ +/N95 1 Tf +5.1817 0 TD +(T1->metaLockCondvar)Tj +/N92 1 Tf +11.5942 0 TD +(\(releasing)Tj +/N95 1 Tf +4.1375 0 TD +(T1->metaLockMutex)Tj +/N92 1 Tf +10.2 0 TD +[(\))-194 (for)]TJ +/N95 1 Tf +1.8867 0 TD +(T2->got-)Tj +-34.2 -1.1667 TD +(MetaLockSlow)Tj +/N92 1 Tf +7.5467 0 TD +[(to)-346 (be)-347 (set)-346 (by)]TJ +/N95 1 Tf +5.2192 0 TD +(T1)Tj +/N92 1 Tf +1.2 0 TD +[(.)-347 (When)]TJ +/N95 1 Tf +3.3308 0 TD +(T1)Tj +/N92 1 Tf +1.5467 0 TD +[(subsequently)-346 (obtains)]TJ +/N95 1 Tf +8.8042 0 TD +(T1->metaLockMutex)Tj +/N92 1 Tf +10.2 0 TD +0.347 Tc +[(,i)347 (t)]TJ +-37.8475 -1.1667 TD +0 Tc +[(will)-265 (\336nd)]TJ +/N95 1 Tf +3.6417 0 TD +(T1->succEE)Tj +/N92 1 
Tf +6.265 0 TD +[(set)-266 (to)]TJ +/N95 1 Tf +2.4192 0 TD +(T2)Tj +/N92 1 Tf +1.2 0 TD +[(,)-265 (s)0 (o)-265 (i)0 (t)-265 (will)-265 (tak)10 (e)-265 (the)-265 (if)-265 (branch)-265 (in)]TJ +/N95 1 Tf +12.6242 0 TD +(releaseMetaLockSlow\(\))Tj +/N92 1 Tf +12.6 0 TD +(,)Tj +-38.75 -1.1667 TD +[(write)-311 (out)-311 (the)]TJ +/N95 1 Tf +5.4875 0 TD +(releaseBits)Tj +/N92 1 Tf +6.9108 0 TD +(into)Tj +/N95 1 Tf +1.8675 0 TD +(T2->metaLockBits)Tj +/N92 1 Tf +9.6 0 TD +[(,)-311 (and)-311 (reset)]TJ +/N95 1 Tf +4.5142 0 TD +(T1->succEE)Tj +/N92 1 Tf +6.3108 0 TD +[(before)-311 (set-)]TJ +-34.6908 -1.1667 TD +(ting)Tj +/N95 1 Tf +1.8792 0 TD +(T2->gotMetaLockSlow)Tj +/N92 1 Tf +11.7225 0 TD +(to)Tj +/N95 1 Tf +1.1008 0 TD +(TRUE)Tj +/N92 1 Tf +2.7233 0 TD +[(and)-323 (signal)]TJ +/N95 1 Tf +4.4783 0 TD +(T2)Tj +/N92 1 Tf +1.5225 0 TD +[(that)-323 (it)-322 (has)-323 (been)-323 (handed)-323 (the)-323 (meta-lock,)]TJ +-23.4267 -1.1667 TD +[(allo)25 (wing)]TJ +/N95 1 Tf +3.725 0 TD +(T2)Tj +/N92 1 Tf +1.2 0 TD +( to continue with the meta-lock.)Tj +-4.925 -1.75 TD +[(Thus,)-387 (e)25 (v)15 (ery)-387 (thread)-387 (that)-387 (attempts)-387 (to)-387 (obtain)-387 (the)-386 (meta-lock)-387 (will)-386 (e)25 (v)15 (entually)-386 (obtain)-387 (the)-387 (meta-lock,)]TJ +0 -1.1667 TD +[(ensuring freedom from lock)11 (out.)]TJ +/N91 1 Tf +16 0 0 16 72 215.33 Tm +(6 Extensions to the basic algorithm)Tj +/N92 1 Tf +12 0 0 12 72 192 Tm +[(In)-239 (this)-238 (section,)-238 (we)-239 (discuss)-238 (e)16 (xtensions)-238 (to)-239 (our)-238 (algorithm,)-238 (related)-239 (to)-238 (management)-238 (of)-239 (lock)-238 (records)-239 (and)]TJ +T* +[(optimization)-250 (of)-249 (cases)-249 (where)-249 (we)-250 (may)-249 (safely)-249 (a)20 (v)20 (oid)-249 (meta-locking)-249 (because)-249 (the)-250 (change)-250 (in)-249 (the)-250 (object\325)55 (s)]TJ +T* +[(lock)-450 (state)-450 (requires)-450 (only)-449 (one)-450 (w)10 (ord)-449 (to)-450 (be)-449 (updated.)-450 (W)80 (e)-450 (also)-450 
(demonstrate)-450 (the)-450 (\337e)15 (xibility)-450 (of)-449 (our)]TJ +T* +[(approach)-231 (and)-230 (outline)-230 (ho)25 (w)-231 (t)0 (o)-230 (implement)-231 (it)-230 (on)-231 (hardw)10 (are)-231 (that)-230 (does)-230 (not)-231 (pro)16 (vide)-230 (atomic)]TJ +/N95 1 Tf +33.5058 0 TD +(CAS)Tj +/N92 1 Tf +2.0308 0 TD +(or)Tj +/N95 1 Tf +1.0633 0 TD +(SWAP)Tj +/N92 1 Tf +-36.6 -1.1667 TD +(operations.)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 19 19 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(16)Tj +/N91 1 Tf +14 0 0 14 72 710.67 Tm +0 Tw +[(6.1 Lock r)18 (ecord allocation)]TJ +/N92 1 Tf +12 0 0 12 72 686 Tm +[(As)-273 (we)-272 (discussed)-273 (in)-273 (Section)-272 (3,)-272 (each)-273 (of)-273 (the)-273 (locking)-272 (schemes)-273 (we)-272 (kno)25 (w)-273 (about,)-273 (including)-273 (the)-272 (present)]TJ +0 -1.1667 TD +[(one,)-302 (at)-301 (least)-301 (occasionally)-302 (allocates)-302 (data)-301 (structures)-301 (related)-301 (to)-302 (locking.)-301 (This)-301 (section)-301 (discusses)-302 (ho)25 (w)]TJ +T* +[(those)-270 (data)-270 (structures)-270 (are)-270 (allocated)-270 (and)-269 (deallocated.)-270 (The)-270 (original)-270 (JDK)-270 (allocates)-269 (monitors)-270 (globally)65 (,)]TJ +T* +[(causing)-353 (serialization)-353 (of)-354 (monitor)-353 (cache)-354 (operations)-353 (and)-353 (resulting)-353 (scalability)-354 (bottlenecks.)-353 (Periodi-)]TJ +T* +[(cally)65 (,)-262 (unused)-261 (monitors)-262 (are)-261 (reclaimed.)-261 (The)-262 (thin)-261 (locks)-261 (scheme)-262 (globally)-262 (allocates)-261 (\322f)10 (at)-261 (locks,)70 (\323)-262 (which)]TJ +T* +(remain allocated for the lifetime of the associated object [3].)Tj +0 -1.75 TD +[(In)-258 (our)-258 (scheme,)-257 (lock)-258 (records)-258 
(are)-258 (the)-257 (unit)-258 (of)-258 (allocation.)-257 (Each)-258 (thread)-258 (has)-258 (a)-258 (set)-258 (of)-257 (lock)-258 (records)-258 (for)-258 (its)]TJ +0 -1.1667 TD +[(e)15 (xclusi)25 (v)15 (e)-235 (use,)-234 (link)9 (ed)-235 (together)-234 (in)-235 (a)-235 (free)-235 (list.)-235 (Lock)-236 (records)-234 (on)-236 (a)-235 (thread\325)55 (s)-235 (free)-235 (list)-235 (ha)20 (v)15 (e)-235 (as)-235 (man)15 (y)-236 (\336elds)]TJ +T* +[(as)-266 (possible)-267 (preinitialized:)-266 (the)-266 (o)25 (wner)-266 (\336eld)-267 (points)-266 (to)-267 (the)-266 (o)25 (wning)-267 (thread,)-266 (the)-266 (count)-267 (\336elds)-266 (contains)-266 (1,)]TJ +T* +[(which)-435 (is)-435 (the)-435 (proper)-435 (count)-435 (when)-435 (locks)-435 (are)-435 (\336rst)-435 (acquired,)-435 (and)-434 (the)-435 (queue)-434 (\336eld)-434 (contains)]TJ +/N95 1 Tf +36.6 0 TD +(NULL)Tj +/N92 1 Tf +-36.6 -1.1667 TD +[(because)-268 (uncontended)-268 (locking)-268 (is)-268 (most)-268 (frequent)-269 (\(separation)-268 (of)-268 (the)-269 (free-list)-268 (link)-268 (and)-268 (the)-268 (queue)-269 (link,)]TJ +T* +[(see)-267 (Figure)-250 (4,)-267 (allo)25 (ws)-266 (the)-267 (queue)-267 (\336eld)-266 (to)-267 (be)-267 (preset)-267 (to)]TJ +/N95 1 Tf +20.4183 0 TD +(NULL)Tj +/N92 1 Tf +2.4 0 TD +[(\).)-267 (Lock)-267 (record)-267 (allocation)-267 (is)-266 (optimized)-266 (to)]TJ +-22.8183 -1.1667 TD +[(a)20 (v)20 (oid)-234 (an)16 (y)-234 (test)-233 (for)-233 (an)-233 (empty)-233 (free)-233 (list;)-234 (instead,)-233 (an)-233 (attempt)-233 (to)-234 (dereference)-233 (a)]TJ +/N95 1 Tf +28.8467 0 TD +(NULL)Tj +/N92 1 Tf +2.6333 0 TD +[(pointer)-234 (generates)-233 (a)]TJ +-31.48 -1.1667 TD +[(signal.)-343 (The)-344 (signal)-343 (handler)-343 (recognizes)-344 (the)-344 (situation,)-343 (re\336lls)-344 (the)-344 (thread\325)55 (s)-344 (lock)-344 (record)-343 (free)-343 (list,)-344 (and)]TJ +T* +[(retries)-317 (the)-317 (operation.)-318 (Threads)-316 (start)-317 (with)-318 (8)-317 (free)-318 (lock)-317 
(records)-317 (and)-318 (add)-317 (an)-318 (e)16 (xponentially)-317 (increasing)]TJ +T* +[(number each time the)16 (y e)15 (xhaust the free list.)]TJ +0 -1.75 TD +[(When)-273 (a)-274 (thread)-273 (unlocks)-273 (an)-273 (object,)-273 (the)-273 (lock)-273 (record)-273 (used)-274 (by)-273 (the)-273 (thread)-273 (to)-273 (accomplish)-273 (the)-274 (locking)-272 (is)]TJ +0 -1.1667 TD +[(returned)-233 (to)-234 (the)-233 (thread\325)55 (s)-233 (free)-233 (list.)-234 (In)-233 (our)-233 (current)-233 (implementation,)-233 (the)-233 (set)-233 (of)-233 (lock)-233 (records)-234 (allocated)-232 (to)]TJ +T* +[(a)-234 (g)0 (i)25 (v)15 (en)-235 (thread)-234 (only)-234 (gro)25 (ws;)-235 (there)-234 (is)-235 (no)-235 (pro)15 (vision)-234 (for)-235 (remo)15 (ving)-234 (lock)-235 (records)-234 (from)-234 (a)-235 (thread\325)55 (s)-235 (free)-234 (list)]TJ +T* +[(if)-264 (the)-264 (thread)-263 (brie\337y)-264 (locks)-264 (man)15 (y)-264 (objects,)-264 (b)21 (u)0 (t)-264 (usually)-263 (locks)-264 (fe)25 (w)65 (.)-263 (This)-264 (is)-264 (not)-264 (so)-264 (bad;)-264 (the)-264 (\322high)-264 (w)10 (ater)]TJ +T* +[(mark\323)-321 (of)-320 (allocated)-321 (lock)-320 (records)-321 (is)-320 (limited)-320 (by)-321 (the)-321 (product)-320 (of)-321 (the)-320 (thread)-321 (stack)-321 (size)-321 (and)-320 (the)-320 (maxi-)]TJ +T* +[(mum)-245 (le)15 (xical)-245 (nesting)-245 (depth)-245 (of)-245 (synchronized)-244 (statements)-245 (\(at)-245 (least)-245 (for)-246 (bytecode)-245 (created)-245 (by)-245 (compiling)]TJ +T* +[(Ja)20 (v)25 (a)-373 (language)-373 (source)-372 (code,)-374 (as)-373 (discussed)-373 (in)-373 (Section)-373 (2.2\).)-374 (If)-372 (we)-373 (wished)-374 (to)-373 (add)-373 (a)-374 (mechanism)-372 (to)]TJ +T* +[(return)-238 (lock)-239 (records)-238 (on)-238 (free)-238 (lists)-238 (to)-238 (the)-238 (global)-238 (memory)-238 (pool,)-238 (it)-238 (w)9 (ould)-238 (be)-238 (a)-238 (simple)-239 (matter)-237 (to)-239 (do)-238 (so)-238 (as)]TJ +T* +[(part)-257 (of)-257 (garbage)-257 
(collection,)-257 (as)-258 (long)-257 (as)-258 (we)-257 (can)-258 (guarantee)-257 (that)-257 (no)-257 (thread)-258 (is)-257 (accessing)-257 (the)-258 (lock)-257 (record)]TJ +T* +[(free)-261 (list)-262 (during)-261 (garbage)-262 (collection.)-261 (Our)-262 (system)-261 (has)-261 (a)-261 (general)-261 (mechanism)-261 (for)-261 (restricting)-261 (when)-261 (gar-)]TJ +T* +[(bage collection occurs that can be used to pro)16 (vide this guarantee.)]TJ +/N91 1 Tf +14 0 0 14 72 275.67 Tm +(6.2 Extra fast locking and unlocking of uncontended objects)Tj +/N92 1 Tf +12 0 0 12 72 251 Tm +[(W)80 (e)-238 (can)-238 (optimize)-238 (the)-238 (algorithm)-238 (further)-238 (in)-239 (the)-238 (case)-238 (of)-239 (uncontended)-238 (objects.)-238 (This)-239 (optimization)-238 (fuses)]TJ +T* +[(the)-390 (meta-lock)-389 (and)-389 (monitor)21 (-lock)-390 (operations)-389 (into)-390 (a)-389 (single)-389 (step.)-390 (W)40 (ith)-389 (this)-390 (optimization,)-389 (a)-389 (thread)]TJ +T* +[(attempting)-293 (to)-294 (lock)-293 (an)-294 (object)-293 (reads)-293 (the)-294 (object\325)55 (s)-293 (multi-use)-293 (w)10 (ord.)-294 (If)-294 (the)-293 (object\325)55 (s)-293 (lock)-293 (state)-294 (is)]TJ +/N95 1 Tf +36.6 0 TD +(NEU-)Tj +-36.6 -1.1667 TD +(TRAL)Tj +/N92 1 Tf +2.4 0 TD +[(,)-276 (then)-276 (an)-276 (\322e)16 (xtra)-277 (f)11 (ast\323)-276 (path)-276 (is)-276 (tried.)-275 (The)-277 (thread)-276 (copies)-276 (the)-275 (hash)-276 (and)-276 (age)-276 (bits)-276 (into)-276 (a)-276 (fresh)-276 (lock)]TJ +-2.4 -1.1667 TD +[(record)-233 (and)-233 (b)20 (uilds)-234 (a)-233 (n)0 (e)25 (w)-234 (multi-use)-233 (v)25 (alue)-233 (containing)-234 (the)-234 (lock)-234 (record)-233 (address)-233 (and)-233 (the)]TJ +/N95 1 Tf +33.0833 0 TD +(LOCKED)Tj +/N92 1 Tf +3.8333 0 TD +(state.)Tj +-36.9167 -1.1667 TD +(A)Tj +/N95 1 Tf +0.99 0 TD +(CAS)Tj +/N92 1 Tf +2.0675 0 TD +[(instruction)-268 (is)-268 (then)-267 (used)-268 (to)-268 (atomically)-267 (change)-268 (the)-268 (multi-use)-268 (w)10 (ord)-268 (to)-268 
(the)-268 (ne)26 (w)-268 (v)25 (alue)-268 (if)-267 (it)-267 (has)]TJ +-3.0575 -1.1667 TD +[(not)-279 (changed)-279 (since)-280 (it)-280 (w)10 (a)0 (s)-280 (read.)-280 (If)-280 (the)]TJ +/N95 1 Tf +14.8633 0 TD +(CAS)Tj +/N92 1 Tf +2.08 0 TD +[(succeeds,)-279 (then)-280 (the)-280 (object)-280 (is)-280 (lock)10 (ed;)-280 (otherwise,)-280 (the)-280 (nor-)]TJ +-16.9433 -1.1667 TD +[(mal)-249 (meta-locking)-249 (protocol)-248 (is)-250 (used.)-249 (W)40 (ith)-249 (this)-249 (optimization,)-249 (the)-249 (e)15 (xtra)-248 (f)10 (ast)-249 (path)-249 (for)-249 (locking)-249 (uses)-249 (one)]TJ +T* +[(atomic)-292 (instruction)-291 (rather)-292 (than)-292 (the)-291 (tw)10 (o)-292 (needed)-292 (for)-291 (meta-locking)-292 (and)-292 (meta-unlocking)-292 (and)-292 (the)-291 (total)]TJ +T* +[(number of instructions is smaller \(15 SP)93 (ARC\252 instructions\).)]TJ +0 -1.75 TD +[(A)-287 (similar)-287 (e)15 (xtra)-287 (f)10 (ast)-286 (path)-287 (for)-287 (unlocking)-286 (is)-287 (slightly)-286 (more)-287 (complicated.)-287 (When)-287 (the)-287 (e)16 (xtra)-287 (f)11 (ast)-287 (locking)]TJ +0 -1.1667 TD +[(path)-230 (succeeds,)-230 (the)-230 (only)-229 (lock)-230 (record)-229 (in)-230 (its)-229 (queue)-230 (is)-230 (that)-230 (of)-229 (the)-230 (locking)-229 (thread;)-230 (the)-230 (queue)-229 (\336eld)-230 (of)-229 (that)]TJ +T* +[(lock)-225 (record)-225 (is)]TJ +/N95 1 Tf +5.6192 0 TD +(NULL)Tj +/N92 1 Tf +2.4 0 TD +[(.)-225 (Another)-225 (thread)-225 (may)-225 (add)-225 (a)-225 (lock)-225 (record)-225 (to)-225 (the)-225 (queue,)-226 (changing)-225 (this)-225 (queue)-225 (\336eld)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 20 20 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(17)Tj +-19 56.4858 TD +[(at)-331 (an)15 (y)-332 
(time.)-331 (So)-331 (the)-331 (e)15 (xtra)-332 (f)10 (ast)-331 (unlocking)-331 (path)-332 (must)-331 (atomically)-332 (change)-331 (the)-331 (multi-use)-331 (w)9 (ord)-331 (of)-332 (the)]TJ +0 -1.1667 TD +[(object)-363 (back)-362 (to)-363 (its)-362 (original)-362 (contents,)-363 (b)20 (u)0 (t)-363 (only)-362 (if)-362 (the)-363 (queue)-363 (\336eld)-362 (of)-363 (the)-362 (\336rst)-362 (lock)-363 (record)-363 (remains)]TJ +/N95 1 Tf +T* +(NULL)Tj +/N92 1 Tf +2.4 0 TD +[(.)-258 (Unfortunately)65 (,)-259 (this)-258 (\322double-compare-and-sw)10 (ap\323)-258 (operation)-259 (is)-259 (not)-259 (supported)-258 (in)-259 (man)15 (y)-259 (archi-)]TJ +-2.4 -1.1667 TD +[(tectures)-266 (\(though)-265 (it)-266 (is)-265 (not)-265 (completely)-266 (unheard)-266 (of;)-265 (see)-265 ([10]\).)-266 (T)80 (o)-266 (get)-265 (around)-265 (this,)-265 (we)-266 (add)-265 (a)-266 (n)0 (e)25 (w)-265 (con-)]TJ +T* +[(straint)-322 (to)-322 (the)-322 (slo)25 (w)-322 (path.)-322 (W)80 (e)-323 (require)-322 (that)-322 (lock)-322 (records)-323 (be)-322 (allocated)-322 (with)-322 (eight-byte)-321 (alignment,)-322 (so)]TJ +T* +[(that)-229 (three)-229 (bits)-228 (are)-230 (zero)-229 (in)-229 (the)-230 (address)-228 (of)-229 (a)-228 (lock)-230 (record.)-228 (In)-229 (the)]TJ +/N95 1 Tf +23.95 0 TD +(LOCKED)Tj +/N92 1 Tf +3.8292 0 TD +[(state,)-229 (this)-229 (e)16 (xtra)-229 (bit)-230 (is)-229 (used)-229 (to)]TJ +-27.7792 -1.1667 TD +[(summarize)-243 (the)-243 (state)-242 (of)-243 (the)-243 (queue)-243 (\336eld)-243 (of)-243 (the)-243 (\336rst)-242 (lock)-243 (record:)-243 (we)-242 (maintain)-242 (the)-243 (in)40 (v)25 (ariant)-242 (that)-243 (when)]TJ +T* +[(the)-307 (bit)-307 (is)-307 (0,)-307 (the)-307 (queue)-307 (\336eld)-307 (is)]TJ +/N95 1 Tf +12.2075 0 TD +(NULL)Tj +/N92 1 Tf +2.4 0 TD +[(.)-308 (I)0 (f)-307 (the)-307 (bit)-307 (is)-307 (1,)-307 (the)-307 (queue)-307 (may)-307 (be)-308 (non-)]TJ +/N95 1 Tf +15.7925 0 TD +(NULL)Tj +/N92 1 Tf +2.4 0 TD +[(.)-307 (Thus,)-307 (the)-307 
(\336rst)]TJ +-32.8 -1.1667 TD +[(thread)-303 (to)-303 (enqueue)-303 (a)-303 (lock)-302 (record)-303 (after)-303 (the)-302 (initial)-303 (one)-303 (is)-303 (required)-303 (to)-303 (set)-302 (this)-303 (bit)-303 (when)-302 (releasing)-303 (the)]TJ +T* +[(meta-lock.)-247 (Once)-247 (this)-247 (in)40 (v)25 (ariant)-247 (may)-247 (be)-248 (assumed,)-247 (we)-247 (can)-248 (construct)-247 (an)-247 (e)15 (xtra)-247 (f)10 (ast)-247 (unlock)-247 (path:)-247 (check)]TJ +T* +[(the)-261 (locking)-262 (depth,)-261 (decrementing)-262 (it)-261 (and)-261 (returning)-262 (if)-261 (it)-261 (is)-261 (greater)-261 (than)-261 (one.)-262 (Otherwise,)-261 (construct)-262 (the)]TJ +T* +[(e)15 (xpected)-357 (current)-357 (v)25 (alue)-357 (of)-358 (the)-357 (multi-use)-358 (w)10 (ord)-358 (\(pointer)-357 (to)-357 (same)-357 (lock)-358 (record,)-357 (queue)-358 (\336eld)-358 (bit)-357 (still)]TJ +T* +[(clear)40 (,)]TJ +/N95 1 Tf +2.5008 0 TD +(LOCKED)Tj +/N92 1 Tf +3.9475 0 TD +[(state\),)-347 (and)-348 (the)-347 (desired)-348 (ne)25 (w)-347 (v)25 (alue)-347 (\(original)-347 (multi-use)-348 (bits,)]TJ +/N95 1 Tf +23.7967 0 TD +(NEUTRAL)Tj +/N92 1 Tf +4.5475 0 TD +[(state\),)-347 (and)]TJ +-34.7925 -1.1667 TD +[(perform)-257 (a)]TJ +/N95 1 Tf +4.1792 0 TD +(CAS)Tj +/N92 1 Tf +2.0575 0 TD +[(instruction)-257 (to)-257 (write)-257 (the)-257 (ne)25 (w)-257 (v)25 (alue)-257 (if)-256 (the)-257 (current)-257 (v)25 (alue)-257 (is)-257 (still)-257 (the)-257 (e)15 (xpected)-257 (v)25 (alue.)-257 (If)]TJ +-6.2367 -1.1667 TD +[(no)-332 (other)-331 (thread)-332 (has)-331 (enqueued)-331 (a)-332 (lock)-331 (record,)-331 (then)-331 (the)]TJ +/N95 1 Tf +21.9467 0 TD +(CAS)Tj +/N92 1 Tf +2.1317 0 TD +[(succeeds)-331 (and)-332 (the)-331 (object)-331 (is)-331 (unlock)10 (ed;)]TJ +-24.0783 -1.1667 TD +[(otherwise,)-316 (we)-316 (re)25 (v)15 (ert)-316 (to)-316 (the)-315 (normal)-316 (meta-locking)-315 (protocol.)-316 (If)-316 (recursi)25 (v)15 (e)-316 (locking)-316 (were)-316 (found)-315 (to)-316 (be)]TJ +T* 
+[(v)15 (ery)-231 (rare,)-231 (this)-232 (proposal)-231 (could)-231 (be)-231 (e)15 (xtended)-231 (to)-231 (also)-231 (summarize)-231 (the)-231 (lock)-231 (count)-231 (in)-231 (the)-231 (e)16 (xtra)-232 (bit,)-231 (so)-231 (that)]TJ +T* +[(a)-236 (zero)-236 (bit)-236 (observ)14 (ed)-236 (by)-236 (an)-235 (unlocking)-236 (thread)-236 (implied)-236 (both)-235 (a)]TJ +/N95 1 Tf +23.1317 0 TD +(NULL)Tj +/N92 1 Tf +2.6358 0 TD +[(queue)-236 (\336eld)-236 (and)-235 (a)-236 (lock)-236 (count)-235 (of)-236 (1,)]TJ +-25.7675 -1.1667 TD +0 Tw +[(eliminating the e)16 (xplicit test for recursion.)]TJ +0 -1.75 TD +[(The)-287 (instruction)-286 (count)-286 (of)-286 (the)-287 (e)16 (xtra)-287 (f)10 (ast)-286 (unlocking)-286 (sequence)-286 (is)-286 (similar)-287 (to)-286 (that)-287 (of)-286 (e)15 (xtra)-287 (f)10 (ast)-286 (locking,)]TJ +0 -1.1667 TD +[(and)-376 (both)-376 (use)-376 (a)-376 (single)-376 (atomic)-376 (instruction.)-376 (The)-376 (thin)-376 (locks)-376 (scheme)-376 (uses)-376 (no)-376 (atomic)-376 (instruction)-375 (in)]TJ +T* +[(unlocking,)-300 (b)20 (ut,)-299 (as)-299 (we)-300 (ha)20 (v)15 (e)-299 (discussed,)-299 (pays)-299 (for)-300 (that)-300 (lack)-300 (with)-299 (the)-300 (possibility)-299 (of)-299 (unbounded)-300 (b)20 (usy-)]TJ +T* +[(w)10 (aiting.)-284 (W)80 (e)-284 (feel)-285 (that)-284 (in)-284 (man)15 (y)-285 (situations)-284 (the)-285 (trade-of)25 (fs)-285 (made)-284 (in)-284 (our)-284 (algorithm)-284 (will)-285 (be)-284 (more)-285 (desir-)]TJ +T* +(able.)Tj +/N91 1 Tf +14 0 0 14 72 350.67 Tm +(6.3 Flexibility)Tj +/N92 1 Tf +12 0 0 12 72 326 Tm +[(One)-346 (of)-346 (the)-345 (main)-346 (adv)25 (antages)-345 (we)-346 (ha)20 (v)16 (e)-346 (claimed)-346 (for)-346 (the)-345 (meta-locking)-345 (approach)-346 (is)-345 (\337e)15 (xibility)65 (.)-346 (This)]TJ +T* +[(\337e)15 (xibility)-309 (results)-309 (from)-309 (the)-309 (f)10 (act)-309 (that)-309 (we)-309 (place)-309 (fe)25 (w)-310 (constraints)-308 (on)-309 (the)-310 (nature)-308 (of)-309 (the)-309 
(data)-309 (structures)]TJ +T* +[(protected)-289 (by)-289 (the)-290 (meta-lock.)-288 (Speci\336cally)65 (,)-289 (i)0 (t)-289 (enables)-289 (separation)-289 (of)-289 (mechanism)-289 (and)-289 (polic)15 (y)65 (,)-289 (to)-289 (allo)25 (w)]TJ +T* +[(implementation of a v)26 (ariety of monitor semantics.)]TJ +0 -1.75 TD +[(W)80 (e)-301 (ha)20 (v)15 (e)-302 (tested)-301 (this)-302 (\337e)15 (xibility)-302 (claim)-301 (to)-301 (some)-301 (e)15 (xtent)-302 (in)-301 (an)-302 (attempt)-302 (to)-301 (address)-302 (tw)10 (o)-302 (potential)-301 (short-)]TJ +0 -1.1667 TD +[(comings)-459 (of)-459 (our)-459 (simple)-460 (link)10 (ed-list)-459 (data)-459 (structure:)-459 (lack)-459 (of)-459 (f)11 (airness)-460 (and)-459 (long)-459 (searches)-459 (through)]TJ +T* +[(queues.)-232 (First,)-232 (consider)-233 (f)11 (airness.)-232 (Moti)25 (v)25 (ated)-232 (by)-232 (Buhr)]TJ +/N94 1 Tf +20.535 0 TD +[(et)-233 (al.,)]TJ +/N92 1 Tf +2.465 0 TD +[(who)-232 (classify)-232 (and)-233 (compare)-232 (a)-233 (spectrum)-231 (of)]TJ +-23 -1.1667 TD +[(monitor)-399 (\322styles\323)-399 (that)-399 (of)25 (fer)-399 (dif)25 (ferent)-400 (trade-of)25 (fs)-399 (between)-399 (performance)-399 (and)-399 (f)11 (airness)-400 ([7],)-399 (we)-400 (pro-)]TJ +T* +[(grammed)-338 (a)-338 (v)15 (ersion)-338 (that)-338 (gi)26 (v)15 (e)0 (s)-339 (preference)-338 (to)-338 (a)15 (w)10 (ak)10 (ened)-339 (w)10 (aiters)-338 (\(so-called)-338 (\322priority)-337 (non-blocking)]TJ +T* +[(monitors\323\).)-309 (T)80 (o)-309 (pro)16 (vide)-310 (this)-309 (preference,)-309 (we)-309 (replaced)-309 (the)-309 (single)-309 (queue)-309 (with)-309 (three)-309 (queues,)-309 (holding)]TJ +T* +[(entering,)-294 (w)10 (aiting,)-294 (and)-293 (a)15 (w)10 (ak)10 (ened)-294 (threads,)-293 (respecti)25 (v)15 (ely)65 (.)-293 (N)0 (o)25 (w)-294 (it)-294 (is)-294 (possible)-293 (to)-294 (\336nd)-293 (and)-293 (gi)25 (v)15 (e)-293 (prefer-)]TJ +T* +[(ence)-263 (to)-262 (a)15 (w)9 (ak)11 (ened)-263 (w)9 (aiters)-262 (without)-262 
(searching.)-263 (Similarly)65 (,)]TJ +/N95 1 Tf +22.3358 0 TD +(notify\(\))Tj +/N92 1 Tf +5.0625 0 TD +[(can)-263 (e)15 (x)15 (ecute)-262 (in)-263 (constant)-262 (time,)]TJ +-27.3983 -1.1667 TD +[(by)-278 (mo)15 (ving)-279 (the)-278 (\336rst)-279 (thread)-278 (from)-278 (the)-279 (w)10 (aiting)-279 (queue)-279 (to)-278 (the)-279 (a)15 (w)10 (ak)10 (ened)-279 (queue.)-279 (Second,)-278 (consider)-278 (con-)]TJ +T* +[(tention.)-270 (T)79 (o)-271 (allo)25 (w)-270 (threads)-271 (to)-270 (append)-270 (lock)-271 (records)-270 (to)-271 (queues)-270 (without)-270 (ha)20 (ving)-271 (to)-270 (search)-271 (to)-270 (the)-271 (end)-270 (of)]TJ +T* +[(the)-272 (queue,)-273 (a)-272 (search)-273 (which)-272 (could)-272 (become)-272 (costly)-273 (if)-272 (queues)-272 (get)-272 (long,)-272 (we)-272 (k)10 (ept)-272 (head)-273 (and)-272 (tail)-272 (pointers)]TJ +T* +[(for)-367 (each)-367 (of)-368 (the)-367 (three)-367 (queues.)-368 (T)80 (ail)-367 (pointers)-367 (also)-367 (allo)25 (w)]TJ +/N95 1 Tf +22.3133 0 TD +(notifyAll\(\))Tj +/N92 1 Tf +6.9675 0 TD +[(to)-367 (run)-368 (in)-367 (constant)-367 (time,)]TJ +-29.2808 -1.1667 TD +[(re)15 (gardless)-305 (of)-304 (the)-305 (number)-305 (of)-304 (threads)-305 (w)10 (aiting,)-305 (via)-305 (list)-305 (concatenation.)-305 (While)-305 (this)-305 (alternati)26 (v)15 (e)-305 (imple-)]TJ +T* +[(mentation)-286 (w)9 (a)0 (s)-287 (straightforw)11 (ard,)-287 (performance)-287 (turned)-287 (out)-287 (to)-287 (be)-287 (inferior)-287 (to)-287 (our)-287 (single-queue)-286 (system)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 21 21 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(18)Tj +-19 56.4858 TD +[(because)-349 (greater)-349 (f)11 (airness)-350 (incurs)-348 (a)-349 (higher)-349 (conte)15 
(xt)-349 (switch)-349 (rate)-349 (and)-348 (the)-350 (three-queue)-348 (data)-349 (structures)]TJ +0 -1.1667 TD +0 Tw +[(were more hea)20 (vyweight.)]TJ +/N91 1 Tf +14 0 0 14 72 665.67 Tm +[(6.4 Hard)15 (war)18 (e without)]TJ +/N98 1 Tf +9.8829 0 TD +(SWAP)Tj +/N91 1 Tf +2.4 0 TD +( or)Tj +/N98 1 Tf +1.4443 0 TD +(CAS)Tj +/N92 1 Tf +12 0 0 12 72 641 Tm +[(The)-230 (meta-lock)-229 (algorithm)-230 (relies)-230 (on)-230 (tw)10 (o)-230 (\322)0 (e)16 (xotic\323)-230 (atomic)-230 (operations:)]TJ +/N95 1 Tf +26.6508 0 TD +(SWAP)Tj +/N92 1 Tf +2.63 0 TD +(and)Tj +/N95 1 Tf +1.6733 0 TD +(CAS)Tj +/N92 1 Tf +1.8 0 TD +[(.)-230 (First,)-230 (note)-230 (that)]TJ +/N95 1 Tf +-32.7542 -1.1667 TD +(SWAP)Tj +/N92 1 Tf +2.6483 0 TD +[(is)-248 (easily)-249 (simulated)-248 (using)]TJ +/N95 1 Tf +10.0492 0 TD +(CAS)Tj +/N92 1 Tf +1.8 0 TD +[(:)-249 (repeatedly)-247 (read)-249 (the)-248 (memory)-248 (location)-249 (and)]TJ +/N95 1 Tf +17.1233 0 TD +(CAS)Tj +/N92 1 Tf +2.0475 0 TD +[(until)-248 (success.)]TJ +-33.6683 -1.1667 TD +(The)Tj +/N95 1 Tf +1.9692 0 TD +(CAS)Tj +/N92 1 Tf +2.2142 0 TD +[(operation,)-414 (or)-414 (some)-414 (other)-414 (suf)25 (\336ciently)-413 (po)25 (werful)-414 (primiti)25 (v)15 (e)-414 (such)-414 (as)-414 (\322load-lock)10 (ed/store-)]TJ +-4.1833 -1.1667 TD +[(conditional,)70 (\323)-423 (seem)-422 (to)-423 (be)-422 (a)20 (v)25 (ailable)-423 (on)-422 (most)-422 (modern)-422 (architectures,)-422 (including)-422 (mainstream)-422 (Intel,)]TJ +T* +[(UltraSP)92 (ARC\252, Po)26 (werPC, and Alpha microprocessors.)]TJ +0 -1.75 TD +[(The)-296 (JVM)-296 (in)-295 (which)-296 (we)-296 (implemented)-296 (our)-295 (synchronization)-296 (must)-296 (run)-295 (on)-296 (the)-296 (pre)25 (vious)-295 (generation)-295 (of)]TJ +0 -1.1667 TD +[(SP)92 (ARC)-314 (processors,)-314 (which)-314 (has)]TJ +/N95 1 Tf +12.58 0 TD +(SWAP)Tj +/N92 1 Tf +2.7142 0 TD +[(b)20 (u)0 (t)-314 (does)-314 (not)-314 (ha)20 (v)16 (e)]TJ +/N95 1 Tf +7.4783 0 TD +(CAS)Tj +/N92 1 Tf +1.8 0 TD +[(.)-314 
While correctness cannot be compromised, it was deemed acceptable to trade away some performance and scalability on this older hardware. We first dropped the extra fast synchronization optimization because it relies directly on CAS. Next, we modified getMetaLock() to use a test-and-set protocol (where “set” means swapping out the locked value 1 and “test” means obtaining a non-locked value). When the test fails, the thread yields and optionally sleeps (using exponential back-off as in [1]). The corresponding releaseMetaLock() operation simply stores back the release bit pattern, which of course must be different from the locked value 1.

7 Performance

Usually, good performance is taken to mean that both memory and CPUs are used efficiently. Since different systems must make space/time trade-offs differently, we shall consider space and time costs for our synchronization algorithm separately. All our measurements were collected using a near-FCS version of EVM on a lightly loaded 4-CPU 296 MHz UltraSPARC system with 2 gigabytes of RAM and the Solaris™ 2.6 operating system. Some measurements were obtained by adding counters to the code. To minimize the disturbance resulting from the instrumentation, we used per-thread counters that were accumulated into global totals as threads exited.

7.1 Benchmarks

Table 1 shows the benchmarks we use to assess the performance of our synchronization code. The HelloWorld program shows how the minimal program behaves. The next seven lines show widely-known SPECjvm98 benchmarks [26]. Finally, we include a selection of multi-threaded benchmarks, some of which perform significant amounts of I/O and some of which use graphics. The execution times in the table are best of two runs.

Table 1. Characterization of benchmark programs

                                                                                                          real time, seconds
Benchmark          Description                                              #lines (a)  #threads (b)  with extrafast  without extrafast
Hello              Hello world program                                               5             1             0.7                0.7
_201_compress      LZW compression and decompression                               927             1            45.8               46.0
_202_jess          Version of NASA’s CLIPS expert system shell                  10,579             1            22.3               24.8
_209_db            Search and modify a database                                  1,028             1            72.1               87.9
_213_javac         Source to bytecode compiler                                  25,211             1            42.7               48.9
_222_mpegaudio     Decompress audio file                                           n/a             1            50.4               51.3
_227_mtrt          Multi-threaded image rendering                                3,799             2            13.2               13.6
_228_jack          Parser generator generating itself                            8,194             1            34.5               38.0
_224_richards      Five threads running multiple versions of O/S simulator       3,637             5            17.3               19.0
_233_tmix          Thread mix: sort, crc, producer-consumer, primes, etc.        8,194            14            28.8               29.1
SwingMark          Benchmark and test of swing libraries                         3,998             8            51.3               51.5
volano server (c)  “Chat server,” reads and distributes messages                   n/a           406             9.6               10.3
volano client      Generates work-load to stress server                            n/a           402             9.6               10.3
JWS (d)            Java Web Server™, serving 20,000 requests                   200,000            63            25.7               27.3

a. Approximate lines of source code in the benchmark itself, excluding class library code.
b. Maximum number of active threads, excluding three system threads (finalizer, reference handler, and signal dispatcher). All SPEC programs actually run a second user-level thread (secondary finalizer) but since this thread is very short-lived, we do not count it.
c. VolanoMark version 2.0.0 build 137 [22].
d. http://www.sun.com/software/jwebserver/index.html.

7.2 Space performance

We consider separately the space costs of used and unused synchronization capability.

Cost of used synchronization capability: the cost for objects that are actually synchronized upon. From the description of our algorithm, it follows that this space cost is proportional to the number of lock records in use at any point in time. More precisely, since threads recycle lock records locally rather than globally, we report the number of lock records allocated by the global allocator during the execution of each benchmark. This higher number reflects our implementation more accurately. In the worst case, a program will synchronize on every object allocated (see Section 2.2), making the worst-case space cost of our algorithm, as well as of any other algorithm that we know of, proportional to the size of the heap. Fortunately, this number is far more pessimistic than the behavior typical programs exhibit. Table 2 shows that for our benchmarks, the total number of lock records allocated is very small, and pales in comparison with the total number of objects allocated. Moreover, at 24 bytes per lock record, even the worst program seen, the volano server, consumes 77 Kbytes for lock records, a trivial amount compared with the several Mbytes used for objects and thread stacks.

Cost of unused synchronization capability: the cost for objects that are never synchronized upon. This cost, in our meta-lock scheme, amounts to two bits per object. However, an alternative view is that the cost is either 0 or 1 word, since it is impractical to have objects of fractional word sizes on contemporary hardware. Put differently, if two spare bits can be found in objects without increasing object sizes, our locking algorithm has no space cost for objects that are not synchronized upon. Otherwise, if finding two bits requires increasing the size of objects by a full word, then a different synchronization algorithm that takes advantage of a full word of memory should (probably) be used. Thus, it can be argued, our algorithm has no space overhead for objects that are not synchronized upon.
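
The 77 Kbytes quoted in Section 7.2 for the volano server can be checked with a little arithmetic: 3,280 lock records (the volano server entry in Table 2) at 24 bytes each is 78,720 bytes. The sketch below reproduces that calculation; the class and method names are hypothetical, and it assumes that a Kbyte means 1,024 bytes, which is the reading under which the result rounds to 77.

```java
// Hypothetical helper reproducing the lock-record space estimate of Section 7.2.
class LockRecordSpace {
    static final int BYTES_PER_LOCK_RECORD = 24; // figure stated in Section 7.2

    // Total bytes occupied by the given number of lock records.
    static long bytesFor(long lockRecords) {
        return lockRecords * BYTES_PER_LOCK_RECORD;
    }

    // Same quantity in Kbytes, assuming 1 Kbyte = 1,024 bytes.
    static double kbytesFor(long lockRecords) {
        return bytesFor(lockRecords) / 1024.0;
    }

    public static void main(String[] args) {
        long volanoServerRecords = 3280; // lock records reported for the volano server
        System.out.println(bytesFor(volanoServerRecords) + " bytes, about "
                + Math.round(kbytesFor(volanoServerRecords)) + " Kbytes");
    }
}
```

By the same arithmetic, most of the benchmarks allocate 40 lock records, which comes to 960 bytes.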
Table 2. Objects allocated, objects synchronized on, and lock records allocated

Benchmark          # objects  # objs sync’ed on  # lock records
Hello                  2,076        262 (12.6%)              40
_201_compress          8,917        936 (10.5%)              40
_202_jess          7,934,141       6,545 (0.1%)              40
_209_db            3,213,429      17,123 (0.5%)              40
_213_javac         5,912,859     351,538 (5.9%)              40
_222_mpegaudio        12,009         994 (8.3%)              40
_227_mtrt          6,641,320       1,195 (0.0%)              60
_228_jack          6,841,290     506,157 (7.4%)              40
_224_richards         36,065       1,878 (5.2%)              80
_233_tmix          1,366,985    169,823 (12.4%)             140
SwingMark          2,345,281      69,245 (3.0%)             100
volano server        140,335       6,334 (4.5%)           3,280
volano client        661,199       3,643 (0.6%)           3,240
JWS                1,336,170      258,960 (19%)             540

7.3 Time performance

We study time performance of our algorithm in two ways. First, we compare the cost of synchronization in our system with that of the original JVM found in the “JDK 1.2 Reference Release for Solaris.” For this study, as explained below, we use synthetic benchmarks. Second, we study the behavior of our algorithm on the more realistic programs shown in Table 1. We do not compare the absolute performance of the two JVMs since they differ in many other respects than the synchronization code.

7.3.1 Time performance comparison with the original JVM

In this section we compare the speed of our synchronization algorithm in EVM, using the extra fast extension, with that of the monitor cache approach in the original JVM. To measure the speed of synchronization rather than the speed of context switching provided by the underlying operating system, we use programs that primarily do uncontended synchronization. Section 6.3.2 shows that contention is relatively infrequent for typical Java programs, justifying this approach, at least in part.

Ideally, we would compare different synchronization algorithms directly by implementing them as alternatives in the same virtual machine. In our case, however, this approach was impractical. First, the implementation effort is non-trivial. Second, an algorithm added quickly to a JVM for the purpose of measuring will be at an inherent disadvantage compared with an algorithm that has been tuned with the rest of the system over a long period. Third, it may be technically impossible to keep all other factors constant, since each algorithm may take advantage of features that the other one does not use (e.g., the monitor cache works best in the presence of handles). Since EVM and the original JVM differ in many respects, comparing bottom-line performance does not reveal much about the two systems’ synchronization code. Fortunately, a different measurement approach can give us the information we want. Consider a program $P$ that performs synchronization. Construct the baseline program $\bar{P}$, which is just like $P$, except that all synchronization has been stripped out. The difference in execution time, $\mathit{sync}(P) = \mathit{time}(P) - \mathit{time}(\bar{P})$, reflects the cost of synchronization. Computing $\mathit{sync}(P)$ for EVM and the original JVM gives numbers that can be compared.

The main limitation of this approach is that real programs usually rely on synchronization for their correct execution, so to ensure that the presence or absence of synchronization does not otherwise affect the computation, we limit our study to synthetic benchmarks. To this end, we constructed a set of simple benchmarks: SyncMethod calls a synchronized method; SyncStmt executes a synchronized statement; RecSyncMethod calls two nested synchronized methods on an object; and RecSyncStmt executes a pair of nested synchronized statements on an object. The first two benchmarks measure the cost of a non-recursive lock/unlock pair, and the second two measure the sum of the costs of a non-recursive and recursive lock/unlock pair. Each benchmark completes 10 million iterations, cycling through an array of length N to select the objects to synchronize on. We let N range from 1 to 512K and plot the cost per iteration; see Figure 10. As one would expect, the meta-locking scheme delivers unchanged performance regardless of the number of objects synchronized upon whereas the monitor cache approach suffers an increasing slowdown, despite the fact that in all of these tests, no more than one object is locked at any time. The graph also shows that the absolute cost of a synchronization operation in EVM is always significantly lower than in the original JVM. For example, a non-recursive synchronized method call, the most frequent form of synchronization, executes in about 220 ns on EVM but takes 550 ns to 1500 ns on the original JVM.

[Figure: synchronization cost per iteration (ns), y axis from 0 to 3000, plotted against number of objects (N), logarithmic x axis from 1 to 100000.]
641.612 l +S +124.5 594.664 m +122.5 598.664 l +120.5 594.664 l +b +137.763 594.807 m +135.763 598.807 l +133.763 594.807 l +b +151.026 594.416 m +149.026 598.416 l +147.026 594.416 l +b +164.289 613.467 m +162.289 617.467 l +160.289 613.467 l +b +177.553 625.363 m +175.553 629.363 l +173.553 625.363 l +b +190.816 629.264 m +188.816 633.264 l +186.816 629.264 l +b +204.079 630.45 m +202.079 634.45 l +200.079 630.45 l +b +217.342 641.465 m +215.342 645.465 l +213.342 641.465 l +b +230.605 640.255 m +228.605 644.255 l +226.605 640.255 l +b +243.868 640.536 m +241.868 644.536 l +239.868 640.536 l +b +257.132 641.16 m +255.132 645.16 l +253.132 641.16 l +b +270.395 640.498 m +268.395 644.498 l +266.395 640.498 l +b +283.658 640.359 m +281.658 644.359 l +279.658 640.359 l +b +296.921 640.398 m +294.921 644.398 l +292.921 640.398 l +b +310.184 641.074 m +308.184 645.074 l +306.184 641.074 l +b +323.447 640.812 m +321.447 644.812 l +319.447 640.812 l +b +336.711 640.722 m +334.711 644.722 l +332.711 640.722 l +b +349.974 639.926 m +347.974 643.926 l +345.974 639.926 l +b +363.237 639.864 m +361.237 643.864 l +359.237 639.864 l +b +376.5 639.612 m +374.5 643.612 l +372.5 639.612 l +b +122.5 596.602 m +135.763 596.749 l +149.026 596.749 l +162.289 596.716 l +175.553 596.64 l +188.816 596.468 l +202.079 596.43 l +215.342 596.726 l +228.605 596.54 l +241.868 596.826 l +255.132 596.435 l +268.395 596.473 l +281.658 596.23 l +294.921 596.516 l +308.184 596.497 l +321.447 596.526 l +334.711 596.645 l +347.974 596.93 l +361.237 596.64 l +374.5 597.159 l +122.5 585.021 m +135.763 584.778 l +149.026 584.949 l +162.289 584.978 l +175.553 584.52 l +188.816 584.668 l +202.079 584.74 l +215.342 584.963 l +228.605 584.406 l +241.868 585.021 l +255.132 584.725 l +268.395 584.897 l +281.658 585.044 l +294.921 585.073 l +308.184 584.906 l +321.447 584.787 l +334.711 584.811 l +347.974 584.463 l +361.237 584.72 l +374.5 584.849 l +120.5 585.021 m +124.5 585.021 l +122.5 583.021 m +122.5 
587.021 l +133.763 584.778 m +137.763 584.778 l +135.763 582.778 m +135.763 586.778 l +147.026 584.949 m +151.026 584.949 l +149.026 582.949 m +149.026 586.949 l +160.289 584.978 m +164.289 584.978 l +162.289 582.978 m +162.289 586.978 l +173.553 584.52 m +177.553 584.52 l +175.553 582.52 m +175.553 586.52 l +186.816 584.668 m +190.816 584.668 l +188.816 582.668 m +188.816 586.668 l +200.079 584.74 m +204.079 584.74 l +202.079 582.74 m +202.079 586.74 l +213.342 584.963 m +217.342 584.963 l +215.342 582.963 m +215.342 586.963 l +226.605 584.406 m +230.605 584.406 l +228.605 582.406 m +228.605 586.406 l +239.868 585.021 m +243.868 585.021 l +241.868 583.021 m +241.868 587.021 l +253.132 584.725 m +257.132 584.725 l +255.132 582.725 m +255.132 586.725 l +266.395 584.897 m +270.395 584.897 l +268.395 582.897 m +268.395 586.897 l +279.658 585.044 m +283.658 585.044 l +281.658 583.044 m +281.658 587.044 l +292.921 585.073 m +296.921 585.073 l +294.921 583.073 m +294.921 587.073 l +306.184 584.906 m +310.184 584.906 l +308.184 582.906 m +308.184 586.906 l +319.447 584.787 m +323.447 584.787 l +321.447 582.787 m +321.447 586.787 l +332.711 584.811 m +336.711 584.811 l +334.711 582.811 m +334.711 586.811 l +345.974 584.463 m +349.974 584.463 l +347.974 582.463 m +347.974 586.463 l +359.237 584.72 m +363.237 584.72 l +361.237 582.72 m +361.237 586.72 l +372.5 584.849 m +376.5 584.849 l +374.5 582.849 m +374.5 586.849 l +S +[1 3.2 ]0 d +122.5 585.368 m +135.763 585.535 l +149.026 585.759 l +162.289 585.549 l +175.553 585.468 l +188.816 586.154 l +202.079 585.492 l +215.342 585.649 l +228.605 585.411 l +241.868 585.287 l +255.132 585.278 l +268.395 584.935 l +281.658 584.816 l +294.921 584.873 l +308.184 584.987 l +321.447 585.673 l +334.711 585.716 l +347.974 585.797 l +361.237 585.768 l +374.5 586.193 l +S +[4 ]0 d +122.5 581.338 m +135.763 581.348 l +149.026 581.395 l +162.289 581.405 l +175.553 581.348 l +188.816 580.957 l +202.079 581.167 l +215.342 581.224 l +228.605 
581.281 l +241.868 581.295 l +255.132 581.386 l +268.395 581.295 l +281.658 581.305 l +294.921 581.309 l +308.184 581.314 l +321.447 581.429 l +334.711 581.552 l +347.974 581.362 l +361.237 581.476 l +374.5 581.938 l +S +[]0 d +389.5 680.3 m +413.5 680.3 l +S +403.52 680.28 m +403.52 681.384 402.624 682.28 401.52 682.28 c +400.416 682.28 399.52 681.384 399.52 680.28 c +399.52 679.176 400.416 678.28 401.52 678.28 c +402.624 678.28 403.52 679.176 403.52 680.28 c +B +BT +9 0 0 9 417.5 677.6 Tm +(Original-RecSyncStmt)Tj +ET +389.5 669.5 m +413.5 669.5 l +S +399.5 667.5 4 4 re +B +BT +9 0 0 9 417.5 666.8 Tm +(Original-RecSyncMethod )Tj +ET +389.5 658.7 m +413.5 658.7 l +399.5 656.7 m +403.5 660.7 l +399.5 660.7 m +403.5 656.7 l +S +BT +9 0 0 9 417.5 656 Tm +(Original-SyncStmt)Tj +ET +389.5 647.9 m +413.5 647.9 l +S +403.5 645.9 m +401.5 649.9 l +399.5 645.9 l +b +BT +9 0 0 9 417.5 645.2 Tm +(Original-SyncMethod)Tj +ET +389.5 637.1 m +413.5 637.1 l +S +BT +9 0 0 9 417.5 634.4 Tm +(EVM-RecSyncStmt)Tj +ET +389.5 626.3 m +413.5 626.3 l +399.5 626.3 m +403.5 626.3 l +401.5 624.3 m +401.5 628.3 l +S +BT +9 0 0 9 417.5 623.6 Tm +(EVM-RecSyncMethod)Tj +ET +[1 3.2 ]0 d +389.5 615.5 m +413.5 615.5 l +S +BT +9 0 0 9 417.5 612.8 Tm +(EVM-SyncStmt)Tj +ET +[4 ]0 d +389.5 604.7 m +413.5 604.7 l +S +BT +9 0 0 9 417.5 602 Tm +(EVM-SyncMethod)Tj +/N92 1 Tf +12 0 0 12 225.25 516.5 Tm +(Figure 10. 
Cost of synchronization)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 25 25 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(22)Tj +/N91 1 Tf +-19 56.4858 TD +0 Tw +[(7.3.2 Beha)26 (vior of our algorithm on r)18 (ealistic pr)18 (ograms)]TJ +/N92 1 Tf +0 -1.8333 TD +[(Consider)-325 (no)25 (w)-325 (the)-325 (algorithm\325)55 (s)-324 (beha)20 (vior)-325 (on)-325 (realistic)-324 (programs.)-324 (W)80 (e)-325 (\336rst)-325 (study)-325 (the)-325 (\322pure\323)-325 (form)-324 (of)]TJ +0 -1.1667 TD +[(the)-231 (meta-lock)-231 (algorithm,)-232 (without)-231 (e)16 (xtra)-232 (f)11 (ast)-231 (locking)-232 (and)-231 (unlocking.)-231 (In)-231 (this)-231 (case,)-231 (each)-231 (monitor)20 (-le)25 (v)15 (e)0 (l)]TJ +T* +[(synchronization)-297 (operation)-298 (in)40 (v)20 (olv)16 (es)-299 (a)]TJ +/N95 1 Tf +15.115 0 TD +(getMetaLock\(\))Tj +/N92 1 Tf +8.0983 0 TD +(and)Tj +/N95 1 Tf +1.7417 0 TD +(releaseMetaLock\(\))Tj +/N92 1 Tf +10.4983 0 TD +[(call.)-298 (The)]TJ +-35.4533 -1.1667 TD +[(left)-309 (half)-308 (of)-308 (T)79 (able)-250 (3)-308 (sho)25 (ws)-308 (that)-308 (the)-309 (f)10 (ast)-309 (path)-308 (is)-309 (tak)10 (en)-308 (in)-309 (all)-308 (b)20 (u)0 (t)-309 (a)0 (n)-308 (e)15 (xtremely)-308 (small)-309 (fraction)-308 (of)-309 (the)]TJ +T* +[(cases;)-337 (that)-337 (is,)-337 (meta-lock)-337 (contention)-337 (is)-337 (e)15 (xtremely)-337 (rare.)-337 (W)80 (e)-337 (instrumented)-337 (only)-337 (meta-lock)-337 (acquisi-)]TJ +T* +[(tion,)-281 (since)-281 (the)-281 (algorithm)-281 (is)-280 (such)-281 (that)-281 (the)-280 (number)-281 (of)-281 (f)11 (ast/slo)25 (w)]TJ +/N95 1 Tf +24.7475 0 TD +(getMetaLock\(\))Tj +/N92 1 Tf +8.0808 0 TD +[(calls)-280 (equals)-282 (the)]TJ +-32.8283 -1.1667 TD +[(number of f)10 (ast/slo)25 
(w)]TJ +/N95 1 Tf +8.2142 0 TD +(releaseMetaLock\(\))Tj +/N92 1 Tf +10.2 0 TD +( calls.)Tj +-18.4142 -30 TD +[(Ha)20 (ving)-234 (con\336rmed)-233 (that)-234 (the)-234 (meta-locking)-233 (f)10 (ast)-234 (paths)-233 (dominate,)-234 (let)-233 (us)-233 (study)-234 (those)-234 (f)11 (ast)-234 (paths)-233 (on)-234 (a)-234 (typ-)]TJ +0 -1.1667 TD +[(ical)-303 (RISC)-304 (processor)56 (.)-304 (Figure)-250 (11)-304 (sho)25 (ws)-304 (the)-304 (SP)92 (ARC)-304 (instructions)-304 (that)-303 (result)-305 (from)-303 (translating)-304 (a)-304 (syn-)]TJ +T* +(chronization operation of the form:)Tj +/N95 1 Tf +10 0 0 10 108 199.33 Tm +(multiUseWord = getMetaLock\(ee, obj\);)Tj +2.4 -1.2 TD +(newMultiUseWord = bodyOfSynchronizationOperation\(ee, obj,)Tj +29.4 -1.2 TD +(multiUseWord\);)Tj +-31.8 -1.2 TD +(releaseMetaLock\(ee, obj, newMultiUseWord\);)Tj +/N92 1 Tf +12 0 0 12 72 143 Tm +[(On)-345 (entry)65 (,)-346 (w)0 (e)-346 (assume)-345 (that)-346 (re)15 (gister)]TJ +/N95 1 Tf +14.1292 0 TD +(%i0)Tj +/N92 1 Tf +2.1458 0 TD +[(holds)-345 (the)-346 (address)-345 (of)-345 (the)-345 (e)15 (x)15 (ecution)-346 (en)41 (vironment)]TJ +/N95 1 Tf +19.735 0 TD +(ee)Tj +/N92 1 Tf +1.5458 0 TD +(and)Tj +/N95 1 Tf +-37.5558 -1.1667 TD +(%i1)Tj +/N92 1 Tf +2.0475 0 TD +[(holds)-247 (the)-247 (address)-248 (of)-247 (an)-247 (object)]TJ +/N95 1 Tf +12.0925 0 TD +(obj)Tj +/N92 1 Tf +1.8 0 TD +[(.)-247 (I)0 (t)-247 (tak)10 (es)-247 (se)25 (v)15 (e)0 (n)-248 (instructions)-247 (to)-247 (perform)-247 (the)-247 (f)10 (ast)-247 (path)]TJ +/N95 1 Tf +20.66 0 TD +(get-)Tj +-36.6 -1.1667 TD +(MetaLock\(\))Tj +/N92 1 Tf +6 0 TD +[(,)-302 (including)-303 (e)15 (xtracting)-302 (the)-302 (high)-303 (30)-302 (bits)-303 (of)-302 (the)-303 (multi-use)-303 (w)10 (ord)-303 (into)-302 (one)-303 (re)15 (gister)-302 (and)]TJ +-6 -1.1667 TD +[(the)-242 (lo)26 (w)-242 (2)-243 (bits)-242 (\(the)-242 (lock)-242 (state\))-242 (into)-241 (another)55 (.)-242 (The)-242 (code)-242 (for)-242 (the)-241 (body)-242 (of)-242 
(the)-242 (synchronization)-242 (operation)]TJ +T* +[(w)10 (ould)-344 (follo)25 (w)64 (.)-343 (At)-343 (the)-344 (end,)-343 (we)-343 (ha)20 (v)15 (e)-344 (4)-343 (instructions)-344 (for)-344 (the)-343 (f)10 (ast)-343 (path)-344 (of)]TJ +/N95 1 Tf +28.55 0 TD +(releaseMetaLock\(\))Tj +/N92 1 Tf +10.2 0 TD +(.)Tj +-27.9783 41.25 TD +[(T)80 (able 3. Frequenc)16 (y of meta-lock contention)]TJ +/N91 1 Tf +10 0 0 10 220.65 557.33 Tm +[(without extra fast)-10388 (with extra fast)]TJ +-12.542 -1.7 TD +[(Benchmark)-3832 (# getMetaLock)-1246 (#getMetaLockSlo)10 (w)-1246 (#)0 ( getMetaLock)-1375 (#getMetaLockSlo)10 (w)]TJ +/N92 1 Tf +T* +[(Hello)-11647 (3,054)-8059 (0)-6309 (1,418)-8317 (0)]TJ +T* +[(_201_compress)-7092 (22,180)-8059 (3)-6309 (2,269)-8317 (2)]TJ +T* +[(_202_jess)-8119 (9,619,742)-7559 (19)-6309 (4,171)-7817 (10)]TJ +T* +[(_209_db)-7619 (106,829,540)-8059 (1)-6309 (2,024)-8317 (5)]TJ +T* +[(_213_ja)20 (v)25 (a)0 (c)-7054 (34,380,756)-7559 (50)-5809 (39,949)-7817 (61)]TJ +T* +[(_222_mpe)15 (gaudio)-6445 (22,813)-7559 (10)-6309 (2,620)-8317 (5)]TJ +T* +[(_227_mtrt)-7952 (1,424,925)-8059 (1)-6309 (2,397)-8317 (3)]TJ +T* +[(_228_jack)-7453 (23,851,600)-8059 (8)-6309 (2,979)-8317 (5)]TJ +T* +[(_224_richards)-7648 (70,560)-7559 (59)-6309 (3,434)-7817 (52)]TJ +T* +[(_233_tmix)-7785 (8,711,428)-6309 (1,612)-4559 (2,183,531)-6567 (1,730)]TJ +T* +[(SwingMark)-7397 (4,062,787)-6309 (2,508)-5309 (465,758)-6567 (1,977)]TJ +T* +[(v)20 (olano serv)15 (er)-6739 (9,622,570)-7059 (587)-5309 (209,341)-7317 (542)]TJ +T* +[(v)20 (olano client)-6945 (9,495,680)-8059 (6)-5809 (17,625)-8317 (3)]TJ +T* +[(JWS)-10230 (1,783,691)-6309 (7,800)-5309 (162,586)-6567 (2,743)]TJ +ET +0 G +0 J +0 j +0.6 w +3.86 M +[]0 d +94.23 296.25 m +94.23 550.75 l +B* +172.83 295.75 m +172.83 567.75 l +B* +258.42 295.75 m +258.42 551.25 l +B* +344.01 295.75 m +344.01 568.25 l +B* +429.6 295.75 m +429.6 551.25 l +B* +517.77 296.25 m +517.77 567.75 l +B* +518.02 568 m +172.58 568 l +B* +518.02 551 m +93.98 
551 l +B* +517.52 535.25 m +94.48 535.25 l +B* +517.52 532.75 m +94.48 532.75 l +B* +517.52 518.25 m +94.48 518.25 l +B* +517.52 515.75 m +94.48 515.75 l +B* +518.02 500 m +93.98 500 l +B* +518.02 483 m +93.98 483 l +B* +518.02 466 m +93.98 466 l +B* +518.02 449 m +93.98 449 l +B* +518.02 432 m +93.98 432 l +B* +518.02 415 m +93.98 415 l +B* +517.52 399.25 m +94.48 399.25 l +B* +517.52 396.75 m +94.48 396.75 l +B* +518.02 381 m +93.98 381 l +B* +518.02 364 m +93.98 364 l +B* +518.02 347 m +93.98 347 l +B* +518.02 330 m +93.98 330 l +B* +518.02 313 m +93.98 313 l +B* +518.02 296 m +93.98 296 l +B* +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 26 26 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(23)Tj +-19 36.2892 TD +[(This)-369 (gi)25 (v)15 (e)0 (s)-369 (u)0 (s)-370 (a)-369 (total)-370 (of)-369 (11)-369 (instructions)-370 (for)-370 (the)-370 (f)11 (ast)-370 (paths)-370 (of)-369 (meta-lock)-370 (acquisition)-369 (and)-370 (release.)]TJ +0 -1.1667 TD +[(While)-327 (a)-327 (careful)-326 (analysis)-327 (of)-327 (the)-326 (c)15 (ycles)-327 (consumed)-326 (by)-327 (an)-327 (optimal)-327 (implementation)-326 (is)-327 (interesting)-326 (in)]TJ +T* +[(the)-263 (conte)15 (xt)-263 (of)-262 (a)-263 (particular)-263 (architecture,)-262 (for)-263 (the)-262 (present)-263 (purposes)-262 (we)-263 (shall)-263 (be)-263 (satis\336ed)-262 (with)-263 (consid-)]TJ +T* +[(ering)-413 (the)-414 (SP)92 (ARC)-414 (implementation)-413 (representati)25 (v)15 (e)-413 (of)-414 (a)-414 (typical)-414 (RISC)-414 (implementation.)-413 (The)-413 (most)]TJ +T* +[(costly)-427 (instructions)-427 (are)-426 (the)-427 (tw)10 (o)-427 (atomic)-426 (instructions,)]TJ +/N95 1 Tf +21.6158 0 TD +(swap)Tj +/N92 1 Tf +2.8267 0 TD +(in)Tj +/N95 1 Tf +1.205 0 TD 
+(getMetaLock\(\))Tj +/N92 1 Tf +7.8 0 TD +[(,)-427 (and)]TJ +/N95 1 Tf +2.5475 0 TD +(cas)Tj +/N92 1 Tf +2.2267 0 TD +(in)Tj +/N95 1 Tf +-38.2217 -1.1667 TD +(releaseMetaLock\(\))Tj +/N92 1 Tf +10.2 0 TD +0 Tw +(.)Tj +-10.2 -1.75 TD +[(No)25 (w)-276 (consider)-275 (the)-276 (performance)-276 (of)-275 (the)-276 (system)-275 (with)-276 (e)15 (xtra)-276 (f)10 (ast)-276 (synchronization)-276 (enabled.)-276 (Recall)-276 (that)]TJ +0 -1.1667 TD +[(this)-323 (optimization)-323 (fuses)-323 (meta-locking)-323 (and)-323 (monitor)20 (-locking)-323 (to)-324 (allo)25 (w)-323 (monitor)21 (-lock)-323 (acquisition)-323 (and)]TJ +T* +[(release)-326 (each)-326 (with)-326 (a)-326 (single)-326 (atomic)-326 (instruction)-326 (in)-326 (uncontended)-326 (cases,)-326 (b)20 (u)0 (t)-326 (i)0 (n)-326 (contended)-326 (cases)-327 (f)11 (alls)]TJ +T* +[(back)-229 (to)-229 (the)-230 (meta-lock)-229 (protocol)-229 (for)-230 (a)-229 (total)-229 (cost)-230 (of)-229 (three)-229 (atomic)-230 (instructions.)-229 (If)-229 (monitor)20 (-lock)-229 (conten-)]TJ +T* +[(tion)-261 (is)-261 (rare,)-262 (as)-261 (Bacon)]TJ +/N94 1 Tf +8.7225 0 TD +[(et)-261 (al)]TJ +/N92 1 Tf +1.7617 0 TD +[(.)70 (\325)56 (s)-262 (data)-261 (indicate)-261 ([3],)-261 (this)-261 (will)-261 (be)-262 (a)-261 (net)-262 (win;)-261 (otherwise,)-261 (it)-261 (could)-262 (be)-261 (a)-262 (loss.)]TJ +-10.4842 -1.1667 TD +[(T)80 (able)-250 (1)-402 (sho)25 (ws)-401 (the)-401 (bottom)-402 (line)-401 (on)-402 (e)15 (xtra)-402 (f)10 (ast)-401 (synchronization)-402 (for)-401 (our)-401 (benchmarks:)-401 (no)-401 (program)]TJ +T* +[(slo)25 (ws)-354 (do)25 (wn,)-353 (and)-353 (se)25 (v)15 (eral)-353 (speed)-354 (up)-353 (signi\336cantly)65 (.)-353 (Comparing)-353 (the)-354 (left)-354 (and)-353 (right)-353 (halv)15 (es)-354 (of)-354 (T)80 (able)-250 (3)]TJ +T* +[(sho)25 (ws)-276 (that)-276 (the)-276 (speedup)-276 (results)-276 (from)-276 (a)-276 (signi\336cant)-276 (reduction)-275 (in)-276 (the)-276 (number)-276 (of)-276 (meta-locking)-275 
(opera-)]TJ +T* +[(tions,)-313 (con\336rming)-313 (that)-313 (monitor)20 (-lock)-313 (contention)-314 (is)-313 (indeed)-314 (rare.)-313 (Ho)25 (we)25 (v)15 (e)0 (r)40 (,)-313 (T)80 (able)-250 (3)-313 (also)-314 (sho)25 (ws)-313 (that,)]TJ +T* +[(for)-433 (some)-432 (programs,)-433 (the)-433 (f)10 (all-back)-433 (case)-433 (is)-433 (suf)25 (\336ciently)-433 (frequent)-433 (that)-432 (its)-432 (performance)-433 (cannot)-433 (be)]TJ +T* +[(ne)15 (glected.)-263 (Finally)65 (,)-264 (the)-264 (similar)-264 (number)-263 (of)-264 (slo)25 (w)-264 (meta-lock)-264 (operations)-263 (in)-264 (the)-264 (left)-264 (and)-263 (right)-264 (halv)15 (es)-264 (of)]TJ +T* +[(T)80 (able)-250 (3)-279 (implies)-279 (that)-278 (e)15 (xtra)-279 (f)10 (ast)-279 (synchronization)-278 (does)-279 (not)-279 (reduce)-279 (contention)-279 (on)-279 (the)-279 (meta-lock.)-279 (\(F)16 (or)]TJ +T* +[(completeness,)-286 (we)-286 (should)-286 (mention)-285 (that)-286 (a)-286 (f)0 (e)25 (w)-285 (of)-286 (the)-285 (meta-lock)-286 (operations)-286 (that)-286 (remain)-285 (when)-286 (using)]TJ +T* +[(e)15 (xtra)-304 (f)10 (ast)-304 (synchronization)-304 (result)-304 (from)-303 (layers)-304 (in)-304 (EVM,)-304 (such)-304 (as)-304 (class)-304 (loading)-304 (and)-303 (JNI,)-304 (that)-304 (do)-303 (not)]TJ +T* +[(use the e)15 (xtra f)11 (ast operations.\))]TJ +/N91 1 Tf +16 0 0 16 72 147.98 Tm +[(8 Conclusions and futur)18 (e w)10 (ork)]TJ +/N92 1 Tf +12 0 0 12 72 124.64 Tm +[(W)80 (e)-251 (ha)20 (v)15 (e)-251 (presented)-250 (a)-251 (meta-locking)-251 (algorithm)-251 (that)-251 (supports)-251 (a)-251 (v)25 (ariety)-251 (of)-250 (higher)20 (-le)25 (v)15 (e)0 (l)-251 (locking)-251 (proto-)]TJ +T* +[(cols,)-319 (by)-319 (pro)15 (viding)-319 (e)15 (xclusi)25 (v)15 (e)-319 (access)-319 (to)-319 (the)-320 (data)-319 (structures)-319 (used)-319 (in)-319 (the)-319 (higher)20 (-le)25 (v)15 (e)0 (l)-320 (protocol.)-318 (This)]TJ +T* +[(meta-locking)-258 (algorithm)-257 (has)-258 (se)26 (v)15 (eral)-258 (virtues.)-258 
(Lik)10 (e)-258 (the)-258 (HotSpot)-257 (system)-258 (that)-257 (introduced)-258 (header)-257 (w)10 (ord)]TJ +T* +[(displacement,)-349 (the)-350 (meta-locking)-349 (algorithm)-350 (is)-350 (highly)-351 (space-ef)25 (\336cient,)-350 (requiring)-350 (only)-350 (tw)10 (o)-350 (reserv)16 (ed)]TJ +T* +[(bits)-263 (in)-264 (each)-264 (object,)-263 (and)-263 (a)-264 (number)-263 (of)-264 (lock)-263 (records)-264 (that)-264 (is)-263 (small)-264 (for)-263 (normal)-264 (programs.)-264 (It)-263 (is)-264 (also)-263 (rea-)]TJ +5.2808 37.3217 TD +[(Figure 11. F)16 (ast path for a synchronization operation wrapped)]TJ +7.38 -1 TD +(in meta-lock and unlock)Tj +/N95 1 Tf +8 0 0 8 99 711.17 Tm +(! getMetaLock)Tj +0 -1.25 TD +(or %i0, 3, %l0 ! %l0 = my busy value)Tj +T* +(add %i1, 4, %l1 ! %l1 = multi-use word address)Tj +T* +(swap [%l1], %l0 ! Swap out busy value)Tj +T* +(and %l0, 3, %l2 ! %l2 = meta-lock state)Tj +T* +(cmp %l2, 3 ! Is lock state busy?)Tj +T* +(beq slowGetMetaLockPath)Tj +T* +(sub %l0, %l2, %l0 ! In delay slot compute high 30 bits)Tj +19.2 -1.25 TD +(! Slow path gets predecessor EE in %l0)Tj +T* +(! and synch operation gets lock)Tj +T* +(! record pointer or age&hash in %l0)Tj +-19.2 -1.25 TD +(... %l2 = body of synchronization operation)Tj +T* +(! releaseMetaLock)Tj +T* +(or %i0, 3, %l0 ! %l0 = my busy value)Tj +T* +(cas [%l1], %l0, %l2 ! if [%l1] == %l0 then swap\([%l1],%l2\)\ +)Tj +T* +(cmp %l0, %l2 ! did we do the swap?)Tj +T* +(bne slowReleaseMetaLockPath)Tj +T* +(! 
unfilled delay slot here)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 27 27 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(24)Tj +-19 56.4858 TD +[(sonably)-270 (time-ef)25 (\336cient)-270 (in)-269 (the)-270 (normal)-270 (case,)-270 (requiring)-270 (7)-270 (instructions)-269 (to)-270 (acquire)-269 (and)-270 (4)-270 (instructions)-269 (to)]TJ +0 -1.1667 TD +[(release)-454 (an)-454 (uncontended)-454 (meta-lock.)-454 (Each)-454 (of)-454 (those)-454 (paths)-454 (includes)-454 (a)-454 (single)-454 (atomic)-454 (instruction.)]TJ +T* +[(Finally)65 (,)-342 (i)0 (t)-343 (i)0 (s)-343 (careful)-343 (to)-343 (a)20 (v)20 (oid)-343 (pathologies)-343 (when)-343 (there)-343 (is)-343 (contention:)-343 (the)-343 (algorithm)-342 (introduces)-343 (no)]TJ +T* +0 Tw +[(b)20 (usy-w)10 (aiting, and only v)15 (ery rarely allocates from global memory)66 (.)]TJ +0 -1.75 TD +[(W)80 (e)-397 (ha)20 (v)15 (e)-397 (also)-397 (presented)-397 (a)-398 (particular)-397 (higher)20 (-le)25 (v)15 (e)0 (l)-397 (locking)-397 (protocol,)-397 (based)-397 (on)-397 (this)-397 (meta-locking)]TJ +0 -1.1667 TD +[(algorithm,)-267 (for)-267 (the)-267 (synchronization)-267 (primiti)26 (v)15 (e)0 (s)-268 (o)0 (f)-267 (the)-268 (Ja)21 (v)25 (a)-268 (virtual)-267 (machine.)-268 (An)-267 (optimization)-267 (of)-268 (this)]TJ +T* +[(protocol)-248 (gains)-249 (the)-249 (ef)25 (\336cienc)15 (y)-249 (o)0 (f)-249 (a)20 (v)20 (oiding)-248 (meta-locking)-249 (in)-249 (most)-249 (cases,)-248 (b)20 (u)0 (t)-249 (the)-248 (ability)-249 (to)-249 (f)10 (all)-248 (back)-249 (to)]TJ +T* +[(meta-locking in uncommon cases re)16 (gularizes and simpli\336es the protocol.)]TJ +0 -1.75 TD +[(Finally)65 (,)-274 (w)0 (e)-275 (h)0 (a)20 (v)15 (e)-274 (implemented)-274 (and)-275 (v)25 
(alidated)-275 (the)-275 (performance)-274 (of)-274 (the)-275 (meta-lock)-274 (in)-275 (the)-275 (conte)15 (xt)-274 (of)-274 (a)]TJ +0 -1.1667 TD +[(high-performance)-427 (Ja)20 (v)25 (a)-428 (virtual)-427 (machine.)-428 (Our)-427 (measurements,)-428 (which)-428 (include)-428 (a)-428 (study)-428 (of)-428 (se)26 (v)15 (eral)]TJ +T* +[(multi-threaded)-257 (programs)-258 (running)-257 (on)-257 (a)-258 (4-CPU)-257 (system,)-258 (indicate)-257 (that)-257 (the)-257 (meta-lock)-257 (algorithm)-257 (oper-)]TJ +T* +[(ates)-323 (with)-323 (a)-323 (l)0 (o)26 (w)-323 (contention)-324 (rate)-323 (to)-323 (ensure)-323 (that)-323 (the)-323 (f)10 (ast)-323 (path)-323 (strongly)-323 (dominates)-323 (the)-323 (performance.)]TJ +T* +[(Synthetic)-366 (benchmarks)-366 (designed)-366 (to)-366 (isolate)-366 (the)-366 (cost)-366 (of)-366 (synchronization)-366 (indicate)-366 (that)-366 (our)-366 (scheme)]TJ +T* +[(outperforms the original monitor cache scheme by a f)11 (actor of three or more.)]TJ +0 -1.75 TD +[(In)-241 (the)-242 (future,)-241 (we)-241 (may)-241 (w)10 (ork)-241 (on)-242 (e)15 (xtending)-241 (the)-242 (e)16 (xtra)-242 (f)10 (ast)-241 (instruction)-241 (sequences)-242 (to)-241 (handle)-241 (more)-242 (cases)]TJ +0 -1.1667 TD +[(while)-309 (continuing)-308 (to)-309 (use)-309 (the)-309 (meta-lock)-309 (protocol)-308 (as)-309 (a)-309 (comfortable)-309 (f)10 (all-back.)-309 (F)14 (o)0 (r)-309 (e)15 (xample,)-308 (if)-309 (mea-)]TJ +T* +[(surements)-317 (justify)-317 (it,)-318 (e)15 (xtra)-317 (f)11 (ast)-318 (locking)-317 (could)-317 (be)-318 (e)15 (xtended)-317 (to)-318 (allo)25 (w)-317 (a)-318 (non-empty)-318 (queue)-318 (with)-318 (lock)]TJ +T* +(state)Tj +/N95 1 Tf +2.135 0 TD +(WAITERS)Tj +/N92 1 Tf +4.2 0 TD +[(.)-302 (W)81 (e)-303 (are)-301 (also)-302 (in)40 (v)15 (estigating)-302 (whether)-302 (the)-302 (high-le)26 (v)15 (e)0 (l)-302 (synchronization)-302 (state)-302 (could)-302 (be)]TJ +-6.335 -1.1667 TD +[(made)-298 (to)-299 (in\337uence)-298 (the)-298 (order)-299 
(in)-298 (which)-298 (threads)-299 (acquire)-298 (meta-locks.)-298 (F)15 (o)0 (r)-299 (e)16 (xample,)-298 (it)-299 (might)-298 (impro)15 (v)15 (e)]TJ +T* +[(ef)25 (\336cienc)14 (y)-257 (i)0 (f)-257 (a)-258 (thread)-258 (attempting)-258 (to)-258 (release)-257 (a)-258 (monitor)20 (-lock)-257 (could)-258 (be)-258 (gi)26 (v)14 (e)0 (n)-258 (preferential)-257 (treatment)-258 (at)]TJ +T* +[(the meta-lock le)25 (v)15 (el.)]TJ +/N94 1 Tf +0 -1.75 TD +[(Ac)20 (knowledgments.)]TJ +/N92 1 Tf +7.6825 0 TD +[(Lars)-232 (Bak,)-231 (Da)20 (vid)-232 (Dice,)-231 (Da)20 (vid)-232 (Holmes,)-231 (Urs)-232 (H\232lzle,)-232 (Doug)-231 (Lea,)-232 (and)-231 (Hong)-232 (Zhang)]TJ +-7.6825 -1.1667 TD +[(pro)15 (vided v)15 (ery useful comments on a draft of the paper)56 (.)]TJ +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 28 28 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(25)Tj +/N91 1 Tf +16 0 0 16 72 709.33 Tm +[(Refer)18 (ences)]TJ +/N92 1 Tf +12 0 0 12 72 686 Tm +[(1.)-750 (Tom)-231 (Anderson.)-232 (The)-231 (Performance)-231 (of)-232 (Spin)-231 (Lock)-232 (Alternatives)-231 (for)-231 (Shared-Memory)-231 (Multiproces-)]TJ +1.44 -1.1667 TD +(sors.)Tj +/N94 1 Tf +2.1108 0 TD +0 Tw +(IEEE Transactions on Parallel and Distributed Systems,)Tj +/N92 1 Tf +22.8892 0 TD +(1\(1\), p. 
6-16, January 1990.)Tj +-26.44 -1.4167 TD +[(2.)-750 (Ken)-341 (Arnold)-340 (and)-341 (James)-341 (Gosling.)]TJ +/N94 1 Tf +15.0075 0 TD +[(The)-341 (Java)-340 (Programming)-340 (Language)]TJ +/N92 1 Tf +14.0208 0 TD +[(.)-341 (The)-341 (Java)-340 (Series,)-340 (Addi-)]TJ +-27.5883 -1.1667 TD +(son-Wesley, 1996.)Tj +-1.44 -1.4167 TD +[(3.)-750 (David)-392 (F.)-391 (Bacon,)-392 (Ravi)-391 (Konuru,)-391 (Chet)-392 (Murthy,)-391 (and)-392 (Mauricio)-391 (Serrano.)-391 (Thin)-391 (Locks:)-392 (Feather-)]TJ +1.44 -1.1667 TD +[(weight)-309 (Synchronization)-309 (for)-309 (Java.)-309 (In)]TJ +/N94 1 Tf +14.8475 0 TD +[(Proc.)-308 (ACM)-309 (SIGPLAN)-309 (\32498)-309 (Conference)-309 (on)-309 (Programming)]TJ +-14.8475 -1.1667 TD +(Language Design and Implementation \(PLDI\))Tj +/N92 1 Tf +18.4983 0 TD +(, p. 258-268, Montreal, Canada, June 1998.)Tj +-19.9383 -1.4167 TD +[(4.)-750 (Lars)-245 (Bak,)-244 (presentation)-244 (on)-244 (the)-245 (HotSpot)-244 (JVM,)-245 (Panel:)-244 (The)-244 (New)-244 (Crop)-245 (Of)-244 (Java)-245 (Virtual)-244 (Machines.)]TJ +1.44 -1.1667 TD +(In)Tj +/N94 1 Tf +1.2342 0 TD +[(Proc.)-401 (ACM)-401 (SIGPLAN)-401 (\32498)-401 (Conference)-401 (on)-401 (Object-Oriented)-401 (Programming)-400 (Systems,)-400 (Lan-)]TJ +-1.2342 -1.1667 TD +(guages, and Applications)Tj +/N92 1 Tf +10.1392 0 TD +(, p. 179-182, Vancouver, Canada, October 1998.)Tj +-11.5792 -1.4167 TD +[(5.)-750 (Andrew)-389 (Birrell.)]TJ +/N94 1 Tf +8.36 0 TD +[(An)-389 (Introduction)-389 (to)-389 (Programming)-389 (with)-389 (Threads)]TJ +/N92 1 Tf +19.4458 0 TD +[(.)-389 (Digital)-389 (Systems)-389 (Research)]TJ +-26.3658 -1.1667 TD +(Center report no. 
35, 1989.)Tj +-1.44 -1.4167 TD +[(6.)-750 (David)-226 (R.)-226 (Butenhof.)]TJ +/N94 1 Tf +9.5108 0 TD +[(Programming)-226 (with)-225 (POSIX)]TJ +9.6 0 0 9.6 312.89 507.8 Tm +(\250)Tj +/N92 1 Tf +12 0 0 12 322.9 503 Tm +[(Threads.)-225 (Addison-Wesley)-225 (Professional)-226 (Com-)]TJ +-19.4683 -1.1667 TD +(puting Series, 1997.)Tj +-1.44 -1.4167 TD +[(7.)-750 (Peter)-269 (A.)-268 (Buhr,)-269 (Michel)-269 (Fortier,)-269 (and)-268 (Michael)-269 (H.)-269 (Coffin.)-269 (Monitor)-268 (Classification.)]TJ +/N94 1 Tf +33.12 0 TD +[(ACM)-269 (Comput-)]TJ +-31.68 -1.1667 TD +(ing Surveys)Tj +/N92 1 Tf +4.6383 0 TD +(, 27\(1\), p. 63-107, March 1995.)Tj +-6.0783 -1.4167 TD +[(8.)-750 (Sylvia)-387 (Dieckmann)-386 (and)-387 (Urs)-387 (H\232lzle.)]TJ +/N94 1 Tf +16.405 0 TD +[(A)-387 (Study)-386 (of)-387 (the)-387 (Allocation)-387 (Behavior)-387 (of)-387 (the)-386 (SPECjvm98)]TJ +-14.965 -1.1667 TD +[(Java)-304 (Benchmarks)]TJ +/N92 1 Tf +7.1342 0 TD +[(.)-303 (Technical)-303 (Report)-303 (TRCS98-33,)-303 (Computer)-303 (Science)-303 (Department,)-304 (University)]TJ +-7.1342 -1.1667 TD +(of California, Santa Barbara, December 1998.)Tj +-1.44 -1.4167 TD +[(9.)-750 (Edsger)-233 (Dijkstra.)-234 (Solution)-233 (of)-234 (a)-233 (Problem)-233 (in)-234 (Concurrent)-233 (Programming)-233 (Control.)]TJ +/N94 1 Tf +32.2217 0 TD +(Communications)Tj +-30.7817 -1.1667 TD +[(of the A)30 (C)0 (M)]TJ +/N92 1 Tf +4.5808 0 TD +( 8\(9\), p. 569, August 1965.)Tj +-6.0208 -1.4167 TD +[(10.)-250 (Michael)-324 (Greenwald)-324 (and)-325 (David)-325 (Cheriton.)-325 (The)-325 (Synergy)-324 (Between)-325 (Non-blocking)-325 (Synchroniza-)]TJ +1.44 -1.1667 TD +[(tion)-350 (and)-350 (Operating)-350 (System)-351 (Structure.)-350 (In)]TJ +/N94 1 Tf +16.7942 0 TD +[(2nd)-350 (Symposium)-350 (on)-350 (Operating)-350 (Systems)-350 (Design)-350 (and)]TJ +-16.7942 -1.1667 TD +(Implementation \(OSDI \32596\))Tj +/N92 1 Tf +11.0533 0 TD +(, p. 
123-136, Seattle, WA, October 1996.)Tj +-12.4933 -1.4167 TD +[(11.)-250 (Per)-239 (Brinch)-239 (Hansen.)-239 (Monitors)-239 (and)-239 (Concurrent)-238 (Pascal:)-239 (a)-239 (personal)-238 (history.)-238 (In)]TJ +/N94 1 Tf +31.5683 0 TD +[(Pr)45 (oceedings)-238 (of)-239 (the)]TJ +-30.1283 -1.1667 TD +[(Second)-396 (A)30 (C)0 (M)-397 (SIGPLAN)-396 (Confer)37 (ence)-396 (on)-396 (History)-397 (of)-396 (Pr)45 (o)10 (g)0 (r)15 (amming)-395 (Langua)9 (g)10 (e)0 (s)]TJ +/N92 1 Tf +31.3992 0 TD +[(,)-397 (p)0 (.)-396 (1-35.)-396 (Pub-)]TJ +-31.3992 -1.1667 TD +(lished as)Tj +/N94 1 Tf +3.7217 0 TD +[(A)29 (CM SIGPLAN Notices)]TJ +/N92 1 Tf +9.5817 0 TD +( 28\(3\), March 1993.)Tj +-14.7433 -1.4167 TD +[(12.)-250 (C.)-252 (A.)-253 (R.)-252 (Hoare.)-252 (Monitors:)-252 (An)-253 (Operating)-252 (System)-252 (Structuring)-252 (Concept.)]TJ +/N94 1 Tf +29.7175 0 TD +[(Communications)-252 (of)-253 (the)]TJ +-28.2775 -1.1667 TD +-0.03 Tc +[(AC)-30 (M)]TJ +/N92 1 Tf +2.0808 0 TD +0 Tc +( 17\(10\), p. 549-557, October 1974.)Tj +-3.5208 -1.4167 TD +[(13.)-250 (Andreas)-355 (Krall.)-355 (Efficient)-355 (JavaVM)-355 (Just-in-Time)-355 (Compilation.)-355 (In)]TJ +/N94 1 Tf +27.7067 0 TD +[(Proc.)-355 (International)-355 (Confer-)]TJ +-26.2667 -1.1667 TD +[(ence)-473 (on)-472 (Parallel)-473 (Architectures)-473 (and)-472 (Compilation)-473 (Techniques)-472 (\(PACT\32598\),)]TJ +/N92 1 Tf +31.0308 0 TD +[(p.)-473 (12-18.)-473 (Paris,)]TJ +-31.0308 -1.1667 TD +(France, October 1998.)Tj +-1.44 -1.4167 TD +[(14.)-250 (Andreas)-227 (Krall)-227 (and)-227 (Mark)-227 (Probst.)-227 (Monitors)-227 (and)-228 (Exceptions:)-227 (How)-227 (to)-228 (implement)-227 (Java)-227 (efficiently.)]TJ +1.44 -1.1667 TD +(In)Tj +/N94 1 Tf +1.0725 0 TD +[(ACM)-240 (1998)-240 (Workshop)-240 (on)-240 (Java)-239 (for)-240 (High-Performance)-239 (Computing)]TJ +/N92 1 Tf +25.8992 0 TD +[(,)-240 (p)0 (.)-240 (15-24,)-239 (Palo)-240 (Alto,)-239 (Cali-)]TJ +-26.9717 -1.1667 TD +(fornia, March 1998.)Tj +-1.44 -1.4167 TD 
+[(15.)-250 (Leslie)-275 (Lamport.)-276 (Proving)-275 (the)-275 (Correctness)-275 (of)-275 (Multiprocess)-275 (Programs.)]TJ +/N94 1 Tf +29.09 0 TD +[(IEEE)-275 (Trans.)-276 (Softw.)-275 (Eng.)]TJ +/N92 1 Tf +-27.65 -1.1667 TD +(SE-3, 2, p. 125-143, March 1977.)Tj +-1.44 -1.4167 TD +[(16.)-250 (Leslie)-302 (Lamport.)-302 (A)-302 (Fast)-302 (Mutual)-302 (Exclusion)-302 (Algorithm.)]TJ +/N94 1 Tf +23.4467 0 TD +[(A)29 (C)0 (M)-302 (T)55 (r)15 (ansactions)-302 (on)-302 (Computing)-302 (Sys-)]TJ +-22.0067 -1.1667 TD +(tem)Tj +/N92 1 Tf +1.4442 0 TD +( 5\(1\), p. 1-11, February 1987.)Tj +-2.8842 -1.4167 TD +[(17.)-250 (Xavier)-926 (Leroy.)]TJ +/N94 1 Tf +8.71 0 TD +[(The)-926 (LinuxThreads)-925 (library)]TJ +/N92 1 Tf +11.685 0 TD +[(.)-926 (http://pauillac.inria.fr/~xleroy/linuxthreads/)]TJ +-18.955 -1.1667 TD +(index.html, 1997.)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 29 29 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(26)Tj +-19 56.4858 TD +[(18.)-250 (Timothy)-460 (Lindholm)-460 (and)-459 (Frank)-459 (Yellin.)]TJ +/N94 1 Tf +17.7158 0 TD +[(The)-460 (Java)-460 (Virtual)-459 (Machine)-460 (Specification)]TJ +/N92 1 Tf +16.7825 0 TD +[(.)-460 (The)-460 (Java)]TJ +-33.0583 -1.1667 TD +0 Tw +(Series, Addison-Wesley, 1996.)Tj +-1.44 -1.4167 TD +[(19.)-250 (Peter)-287 (Magnusson,)-287 (Anders)-287 (Landin,)-287 (and)-287 (Erik)-286 (Hagersten.)]TJ +/N94 1 Tf +23.865 0 TD +[(Ef)18 (\336cient)-288 (Softwar)37 (e)-287 (Sync)16 (hr)45 (onization)-287 (on)]TJ +-22.425 -1.1667 TD +[(Lar)37 (g)10 (e)-316 (Cac)15 (he)-316 (Coher)38 (ent)-316 (Multipr)45 (ocessor)10 (s.)]TJ +/N92 1 Tf +16.5258 0 TD +[(SICS)-316 (Research)-316 (Report)-316 (T94:07,)-316 (Swedish)-316 (Institute)-315 (of)]TJ +-16.5258 -1.1667 TD 
+(Computer Science, Box 1263, S-164 28 Kista, Sweden, February 1994.)Tj +-1.44 -1.4167 TD +[(20.)-250 (John)-260 (M.)-261 (Mellor-Crummey)-260 (and)-260 (Michael)-260 (L.)-261 (Scott.)-260 (Algorithms)-260 (for)-260 (Scalable)-260 (Synchronization)-260 (on)]TJ +1.44 -1.1667 TD +[(Shared-Memory)-368 (Multiprocessors.)]TJ +/N94 1 Tf +13.9842 0 TD +[(ACM)-368 (Transactions)-368 (on)-368 (Computer)-368 (Systems,)]TJ +/N92 1 Tf +17.5908 0 TD +[(9\(1\),)-368 (p.)-367 (21-65,)]TJ +-31.575 -1.1667 TD +(1991.)Tj +-1.44 -1.4167 TD +[(21.)-250 (Gilles)-302 (Muller,)-301 (B\207rbara)-301 (Moura,)-301 (Fabrice)-302 (Bellard,)-301 (and)-302 (Charles)-301 (Consel.)-302 (Harissa:)-301 (A)-301 (Flexible)-301 (and)]TJ +1.44 -1.1667 TD +[(Efficient)-236 (Java)-235 (Environment)-235 (Mixing)-235 (Bytecode)-236 (and)-235 (Compiled)-236 (Code.)-236 (In)]TJ +/N94 1 Tf +27.9233 0 TD +[(Proc.)-236 (of)-235 (the)-235 (3rd)-236 (Confer-)]TJ +-27.9233 -1.1667 TD +[(ence)-310 (on)-309 (Object-Oriented)-309 (Technologies)-309 (and)-309 (Systems)-310 (\(COOTS\),)]TJ +/N92 1 Tf +25.6342 0 TD +[(p.)-309 (1-20,)-309 (Berkeley,)-309 (California,)]TJ +-25.6342 -1.1667 TD +(June 1997.)Tj +-1.44 -1.4167 TD +[(22.)-250 (John)-439 (Neffinger.)-439 (Which)-440 (Java)-439 (VM)-440 (scales)-439 (best?)]TJ +/N94 1 Tf +21.0992 0 TD +(JavaWorld,)Tj +/N92 1 Tf +5.0775 0 TD +[(August)-440 (1998.)-439 (http://www.java-)]TJ +-24.7367 -1.1667 TD +(world.com/javaworld/jw-08-1998/jw-08-volanomark.html. See also www.volan\ +o.com.)Tj +-1.44 -1.4167 TD +[(23.)-250 (Susan)-439 (Owicki)-439 (and)-439 (Leslie)-439 (Lamport.)-439 (Proving)-439 (Liveness)-440 (Properties)-439 (of)-439 (Concurrent)-439 (Programs,)]TJ +/N94 1 Tf +1.44 -1.1667 TD +(ACM Trans. Program. Lang. Syst.)Tj +/N92 1 Tf +13.9733 0 TD +(4, 3, p. 
445-495, July 1982.)Tj +-15.4133 -1.4167 TD +[(24.)-250 (Todd)-234 (A.)-235 (Proebsting,)-233 (Gregg)-234 (Townsend,)-234 (Patrick)-234 (Bridges,)-234 (John)-234 (H.)-234 (Hartman,)-234 (Tim)-234 (Newsham,)-234 (and)]TJ +1.44 -1.1667 TD +[(Scott)-336 (A.)-336 (Watterson.)]TJ +/N94 1 Tf +8.3967 0 TD +[(Toba:)-336 (Java)-336 (For)-337 (Applications\321A)-336 (Way)-336 (Ahead)-336 (of)-336 (Time)-337 (\(WAT\))-336 (Compiler)]TJ +/N92 1 Tf +28.9133 0 TD +(.)Tj +-37.31 -1.1667 TD +(Technical Report, Dept. of Computer Science, University of Arizona, Tucs\ +on, 1997.)Tj +-1.44 -1.4167 TD +[(25.)-250 (Ali-Reza)-331 (Adl-Tabatabai,)-330 (Michal)-331 (Cierniak,)-332 (Guei-Yuan)-331 (Lueh,)-332 (Vishesh)-330 (M.)-332 (Parakh,)-331 (and)-331 (James)]TJ +1.44 -1.1667 TD +[(M.)-415 (Stichnoth.)-415 (Fast,)-415 (Effective)-415 (Code)-416 (Generation)-414 (in)-415 (a)-415 (Just-In-Time)-415 (Java)-415 (Compiler.)-415 (In)]TJ +/N94 1 Tf +35.3658 0 TD +(Proc.)Tj +-35.3658 -1.1667 TD +[(ACM)-476 (SIGPLAN)-476 (\32498)-476 (Conference)-476 (on)-476 (Programming)-476 (Language)-476 (Design)-476 (and)-477 (Implementation)]TJ +T* +(\(PLDI\))Tj +/N92 1 Tf +2.8883 0 TD +(, p. 280-290, Montreal, Canada, June 1998.)Tj +-4.3283 -1.4167 TD +[(26.)-250 (SPECjvm98 Benchmarks. August 19, 1998 release. 
http://www.spec.org/osg/j\ +vm98.)]TJ +T* +[(27.)-250 (Sun)-353 (Microsystems,)-353 (Inc.)-354 (Java)-353 (2)-353 (on-line)-353 (documentation:)-353 (http://java.sun.com/products/jdk/1.2/)]TJ +1.44 -1.1667 TD +(docs/api/index.html.)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Page: 30 30 +%%BeginPageSetup +userdict /pgsave save put +PDFVars begin PDF begin PDFVars/InitAll get exec +%%EndPageSetup +0 0 612 792 RC +1 g +/N93 /ExtGState findRes gs +1 i +0 0 612 792 re +f +BT +/N92 1 Tf +12 0 0 12 300 34.17 Tm +0 g +0 Tc +(27)Tj +/N91 1 Tf +14 0 0 14 72 710.67 Tm +0 Tw +[(About the A)50 (uthors)]TJ +/N92 1 Tf +12 0 0 12 72 685 Tm +-0.004 Tc +[(Ole)-224 (Agesen)-225 (is)-224 (a)-223 (Senior)-225 (Staf)25 (f)-223 (Engineer)-225 (in)-223 (the)-224 (Ja)20 (v)25 (a)-224 (T)79 (opics)-225 (Group)-224 (in)-224 (Sun)-225 (Labs.)-224 (Pre)24 (viously)64 (,)-224 (h)0 (e)-224 (w)10 (ork)10 (ed)]TJ +0 -1.1667 TD +[(in)-333 (the)-332 (Kanban)-333 (group)-333 (and)-332 (the)-333 (Self)-333 (group,)-333 (also)-333 (in)-333 (Sun)-332 (Labs.)-333 (He)-332 (has)-333 (an)-332 (M.S.)-332 (de)15 (gree)-333 (from)-332 (Aarhus)]TJ +T* +[(Uni)24 (v)15 (ersity in Denmark, and M.S. and Ph.D. 
de)10 (grees from Stanford Uni)22 (v)14 (ersity in California.)]TJ +0 -1.9167 TD +0 Tc +[(Da)20 (v)15 (e)-392 (Detlefs)-392 (is)-392 (a)-392 (Staf)25 (f)-392 (Engineer)-392 (in)-391 (the)-392 (Ja)20 (v)25 (a)-392 (T)80 (opics)-392 (Group)-392 (in)-392 (Sun)-392 (Labs.)-392 (He)-391 (obtained)-392 (an)-392 (B.S.)]TJ +0 -1.1667 TD +[(de)15 (gree)-257 (from)-257 (MIT)74 (,)-258 (then)-257 (w)10 (ork)10 (ed)-257 (brie\337y)-257 (for)-257 (TR)55 (W)-257 (and)-258 (as)-257 (a)-257 (staf)25 (f)-258 (programmer)-257 (at)-257 (the)-257 (MIT)-257 (Laboratory)]TJ +T* +[(for)-295 (Computer)-294 (Science.)-295 (Fi)25 (v)15 (e)-294 (years)-295 (of)-294 (ef)25 (fort)-295 (yielded)-294 (a)-295 (Ph.D.)-294 (from)-294 (Carne)15 (gie)-295 (Mellon)-294 (in)-294 (1990.)-295 (Then)]TJ +T* +[(he)-229 (joined)-230 (the)-229 (Digital)-229 (Equipment)-230 (Corporation\325)55 (s)-229 (Systems)-229 (Research)-230 (Center)40 (,)-229 (w)10 (orking)-229 (on)-230 (garbage)-229 (col-)]TJ +T* +[(lection and program v)15 (eri\336cation until joining Sun in 1996.)]TJ +0 -1.9167 TD +[(Ale)15 (x)-367 (Garthw)10 (aite)-367 (recently)-367 (joined)-367 (Sun)-367 (Microsystems)-368 (Laboratories)-367 (in)-368 (Burlington,)-367 (Massachusetts,)]TJ +0 -1.1667 TD +[(where)-227 (he)-227 (w)10 (orks)-227 (with)-227 (the)-226 (Ja)20 (v)25 (a)-227 (T)80 (opics)-227 (Group.)-227 (His)-227 (research)-226 (interests)-227 (include)-227 (programming)-227 (language)]TJ +T* +[(implementation)-228 (and)-228 (runtime)-229 (system)-229 (design,)-229 (automatic)-228 (memory)-228 (management,)-229 (and)-228 (synchronization)]TJ +T* +[(techniques.)-290 (Ale)15 (x)-290 (i)0 (s)-290 (also)-291 (a)-290 (graduate)-290 (student)-291 (at)-290 (the)-290 (Uni)25 (v)15 (ersity)-290 (of)-290 (Pennsylv)25 (ania)-290 (w)10 (orking)-291 (with)-290 (Scott)]TJ +T* +[(Nettles.)-260 (He)-261 (is)-260 (currently)-260 (w)10 (orking)-261 (on)-261 (a)-260 (proposal)-260 (for)-261 (his)-260 (doctoral)-260 (thesis)-260 (which)-260 (in)40 (v)15 
(estig)6 (ates)-261 (ho)25 (w)-261 (run-)]TJ +T* +[(time services in a Ja)20 (v)25 (a)0 ( VM may be safely implemented using the Ja)22 (v)25 (a)0 ( programming language.)]TJ +0 -1.9167 TD +[(Ross)-247 (Knippel)-247 (is)-247 (a)-247 (Senior)-247 (Staf)25 (f)-247 (Engineer)-247 (in)-247 (the)-247 (Ja)20 (v)25 (a)-247 (T)69 (echnology)-246 (Group)-247 (of)-247 (Solaris)-247 (Softw)10 (are.)-247 (Pre)25 (vi-)]TJ +0 -1.1667 TD +[(ously)65 (,)-284 (h)0 (e)-284 (w)10 (ork)10 (ed)-283 (at)-285 (both)-284 (Cray)-284 (Research)-283 (and)-284 (Cray)-284 (Computer)40 (,)-283 (d)0 (e)25 (v)15 (eloping)-284 (compilers)-284 (for)-284 (the)-284 (XMP)111 (,)]TJ +T* +[(Cray3, and Cray4. He has a B.S. de)16 (gree in Computer Science from the Uni)26 (v)15 (ersity of Minnesota.)]TJ +0 -1.9167 TD +[(Y)129 (.S.)-259 (Ramakrishna)-259 (is)-260 (a)-260 (Staf)25 (f)-259 (Engineer)-260 (with)-259 (the)-260 (Ja)21 (v)25 (a)-260 (T)70 (echnology)-260 (Group)-259 (at)-260 (Sun)-260 (Microsystems.)-259 (Pre-)]TJ +0 -1.1667 TD +[(viously)65 (,)-302 (h)0 (e)-301 (w)10 (ork)11 (ed)-302 (at)-301 (TIFR,)-301 (Bombay)65 (,)-302 (and)-302 (at)-301 (SUNY)130 (,)-302 (Ston)15 (y)-302 (Brook,)-302 (on)-301 (the)-302 (use)-301 (of)-301 (temporal)-301 (logics)]TJ +T* +[(and)-228 (model-checking)-227 (for)-228 (the)-228 (v)15 (eri\336cation)-227 (of)-228 (concurrent)-228 (systems.)-228 (He)-227 (has)-228 (a)-228 (B.T)70 (ech.)-228 (from)-228 (IIT)-228 (Kanpur)40 (,)]TJ +T* +(and an M.S. and Ph.D. 
from UC Santa Barbara.)Tj +0 -1.9167 TD +[(Derek)-262 (White)-262 (is)-261 (a)-262 (Staf)25 (f)-262 (Engineer)-261 (at)-262 (Sun)-261 (Microsystems)-261 (Laboratories)-262 (in)-262 (Burlington,)-262 (Massachusetts.)]TJ +0 -1.1667 TD +[(He)-329 (is)-329 (a)-328 (member)-329 (of)-329 (the)-329 (Ja)21 (v)25 (a)-329 (T)80 (opics)-329 (Group,)-329 (which)-328 (has)-329 (de)25 (v)15 (eloped)-329 (a)-328 (JVM)-329 (as)-329 (an)-329 (infrastructure)-329 (for)]TJ +T* +[(de)25 (v)15 (eloping)-347 (soft)-347 (real-time)-347 (and)-348 (scalable)-347 (garbage)-347 (collection)-347 (for)-347 (the)-347 (Ja)20 (v)26 (a)-348 (language.)-347 (The)-347 (group)-347 (has)]TJ +T* +[(also)-229 (studied)-229 (and)-229 (optimized)-230 (JVM)-229 (performance)-228 (as)-229 (a)-229 (whole.)-229 (Other)-230 (interests)-228 (include)-229 (performance)-229 (and)]TJ +T* +[(heap)-444 (analysis)-444 (tools,)-444 (and)-444 (incremental)-445 (de)26 (v)15 (elopment)-444 (en)40 (vironments.)-444 (Prior)-445 (to)-444 (joining)-444 (Sun,)-444 (Derek)]TJ +T* +[(w)10 (ork)10 (ed)-301 (for)-301 (eight)-301 (years)-300 (at)-301 (Apple)-301 (Computer)40 (,)-300 (Inc.,)-301 (where)-300 (he)-301 (w)10 (a)0 (s)-301 (responsible)-301 (for)-301 (the)-300 (Object)-301 (P)15 (ascal)]TJ +T* +[(compiler)-388 (and)-388 (w)10 (ork)10 (ed)-388 (on)-388 (the)-389 (Dylan)-388 (programming)-388 (language)-388 (runtime)-388 (and)-388 (de)25 (v)15 (elopment)-388 (en)40 (viron-)]TJ +T* +(ment.)Tj +ET +PDFVars/TermAll get exec end end +userdict /pgsave get restore +showpage +%%PageTrailer +%%EndPage +%%Trailer +%%EOF Index: ossp-pkg/sio/BRAINSTORM/splice.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/splice.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/splice.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/splice.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/splice.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/splice.ps.gz.L RCS File: 
/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/splice.ps.gz.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/splice.ps.gz.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/splice.ps.gz.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/splice.ps.gz.L +++ - 2024-05-20 02:18:02.437274317 +0200 @@ -0,0 +1 @@ +ftp://ftp.bitmover.com/pub/splice.ps.gz Index: ossp-pkg/sio/BRAINSTORM/thesis.ps.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/thesis.ps.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/thesis.ps.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/thesis.ps.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/thesis.ps.L +++ - 2024-05-20 02:18:02.439790568 +0200 @@ -0,0 +1 @@ +http://www.cs.cmu.edu/~jcb/ Index: ossp-pkg/sio/BRAINSTORM/thesis.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/thesis.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/thesis.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/thesis.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/thesis.ps.gz and - differ Index: ossp-pkg/sio/BRAINSTORM/tr96-169.ps.L RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/tr96-169.ps.L,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/tr96-169.ps.L,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/tr96-169.ps.L' 2>/dev/null --- ossp-pkg/sio/BRAINSTORM/tr96-169.ps.L +++ - 2024-05-20 02:18:02.446023060 +0200 @@ -0,0 +1 @@ +http://www.cs.cmu.edu/~jcb/ Index: ossp-pkg/sio/BRAINSTORM/tr96-169.ps.gz RCS File: /v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/tr96-169.ps.gz,v co -q -kk -p'1.1' '/v/ossp/cvs/ossp-pkg/sio/BRAINSTORM/tr96-169.ps.gz,v' | diff -u /dev/null - -L'ossp-pkg/sio/BRAINSTORM/tr96-169.ps.gz' 2>/dev/null Binary files ossp-pkg/sio/BRAINSTORM/tr96-169.ps.gz and - differ