Re: StringNg review request

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Fri, 17 Sep 2010 05:25:49 -0600

On 09/03/2010 09:21 AM, Kinkie wrote:

> You'll find the branch at lp:~kinkie/squid/stringng
> A few details on what to home on:
> src/SBuf,* : the SBuf class proper

The review for SBuf is below. The code is getting better overall, but
there are still so many corrections that I will need more time to get
the other classes reviewed.

Review is based on lp:~kinkie/squid/stringng r9471.

Original source code is prepended with "> ". The comments are below the
source code they refer to. A single "." comment is there just to
separate one piece of code from another.

For SBuf.h:

> #include "Debug.h"
> #include "RefCount.h"

Remove if not needed for SBuf.h.
If SBuf.cci or any other file needs these, it should include them.

> #include <string.h>

Remove if not needed for SBuf.h

> #include <iostream>

Use <iosfwd> instead if possible.

> #include <sstream>

Remove if not needed for SBuf.h. Does this need #define protection?

> #define SQUIDSBUFPRINT(s) s.length(),s.exportRawRef()

Missing parens around s.

> /**
> * Container for varioous SBuf class-wide statistics.

spelling

> * \note read operations and compares are NOT counted.

Remove. Many things are not counted.

> class SBufStats {
> public:

It would be nice to run source formatting script against the branch.

> u_int64_t alloc; ///number of allocation operations

Vague description: What is being allocated? SBuf? internal buffers?
Consider: "number of constructor calls"

Here and elsewhere, doxygen comments are missing "<" to refer to the
member on the left?

> u_int64_t live; ///number of free operations

See above.
Consider: "number of destructor calls"

> u_int64_t qset; ///number of assigns/appends without content copy
> u_int64_t sset; ///number of assigns/appends with content copy

I am not sure what the actual intent here is, but the description does
not match use. qset is used in clear() and other methods unrelated to
assigns/appends.

> u_int64_t qget; ///number of get-ops not requiring full data scan
> u_int64_t sget; ///number of get-ops requiring a full data scan/copy

It is not clear what "full data scan" is, why it is limited to get
operations, and why the descriptions are not exactly symmetric.

I did not review stats correctness because I do not know what most of
these members really mean.

> /**
> * String-Buffer hybrid class (UniversalBuffer)

Confusing to uninitiated.
Consider: A string or buffer.

> * Features: refcounted backing store, cheap copy and substringing
> * operations, cheap-ish append operations, adaptive memory management

Remove "cheap-ish append operations" because it is too vague to be
useful and because there are no special append optimizations for now. At
least I do not see them in this class and other classes should not
matter here.

Remove "adaptive memory management" buzzwords.

> * copy-on-write semantics to isolate change operations to each
insatnce..

spelling

> * See http://www.cplusplus.com/reference/string/string/ for
documentation about
> * that API.

Remove commercial site ad? Folks will find it if they need it.

> * Notice that users MUST NOT hold pointers to SBuf, as doing so
> * moots the whole point of using refcounted buffers in the first place.
> */

Remove as technically inaccurate. Besides, users MUST NOT do many things.

> class SBuf {
> public:
> typedef signed int size_type;

I think we should support 3GB offsets and such. Could be handy for
referencing memory cache areas without copying. Is there any reason to
limit this to int? How about using size_t or ssize_t?

> static const size_type npos = -1;

Use static_cast<> to avoid dependence on size_type being signed. See
std::string::npos.

> /// Flag used by search methods to define case sensitivity
> static const bool caseInsensitive = false;
> static const bool caseSensitive = true;

Convert to enum?

> /// Maximum size of a SBuf: 256Mb. Arbitrary, but should be sane.
> /// \note only checked when the SBuf needs to be extended, the
first alloc can exceed it
> static const size_type maxSize=0xfffffff;

Remove "the first alloc can exceed it". Let's be consistent. Besides,
the user does not control when things get allocated and does not know
what "first alloc" is.

> /** Default constructor. Creates a new empty SBuf.

Do not document that a default constructor is a default constructor.
Same for all other standard methods: constructors, destructors, operators.
Can you _create_ something _old_?

Replace with: /** Creates an empty SBuf.

> *
> * A SBuf can be empty, but can never be undefined. In other
words there is no equivalent
> * of NULL. defined/undefined semantics need to be implemented by
the caller.
> */

Remove: Avoid documenting what SBuf cannot be. If you insist on
discussing the non-existent APIs, move that discussion to the SBuf.dox file.

> /** Constructor: import c-style string
> *
> * Create a new SBuf containing a COPY of the contents of the
> * c-string
> * \param S the c string to be copied
> * \param Ssize the size of S. If it is SBuf::npos or unspecified,
> * S must be null-terminated

Remove Ssize parameter. You have too many sizes. If you want a
two-parameter conversion from c-string, define another constructor with
just two parameters. See std::string.

> * \param pos how many bytes to skip at the beginning of the c-string
> * \param n how many bytes to import into the SBuf. If it is
SBuf::npos
> * or unspecified, imports to end-of-cstring
> */
> explicit SBuf(const char *S, size_type Ssize = npos, size_type
pos = 0, size_type n=npos);

.

> /** Import a char[] into a SBuf, copying the data.
> *
> * Assign a c-style string to a SBuf.

The above two lines are saying essentially the same thing.

You may use "c-string" instead of "c-style string". It is standard and
shorter.
Do not use char[] if the argument is not exactly char[].

> * It is the caller's duty to free the imported string, if needed.
> * \param S the c string to be copied
> * \param Ssize the size of S. If it is SBuf::npos or unspecified,
> * S must be null-terminated
> * \param pos how many bytes to skip at the beginning of the c-string
> * \param n how many bytes to import into the SBuf. If it is
SBuf::npos
> * or unspecified, imports to end-of-cstring
> */
> SBuf& assign(char const *S, size_type Ssize = npos, size_type pos
= 0, size_type n=npos);

Again, remove Ssize parameter. It is not needed.

Please use "const char *" instead of "char const *" for consistency
sake. Same for a few more method declarations.

> /** clear the SBuf as if it was just created.
> *
> */
> void clear();

s/clear/reset/ in the description. Document whether the already
allocated memory stays allocated. That fact is important for
performance-sensitive caller code.

> /** Append operation
> *
> * Append the supplied SBuf to the current one; extend storage as
needed.

.

> * If the caller wishes to maintain a copy of the unappended-to
> * SBuf she must do so <b>before</b> appending.

Remove. The method is not const and that should be enough. Check other
methods for the same reminders.

> */
> SBuf& append(const SBuf & S);

.

> SBuf& append(const char * S, size_type Ssize = npos, size_type
pos = 0, size_type n = npos);

Remove Ssize.

> /** Assignment operation with printf(3)-style definition
> * \note arguments may be evaluated more than once, be careful
> * of side-effects
> */
> SBuf& Printf(const char *fmt, ...);

Add "Avoid this and other ... methods. For transition only."?
We should be using ostream formatting instead.

> /** print a SBuf. generally only used by operator <<
> */
> std::ostream& print(std::ostream &os) const;

Remove "generally only used by operator <<".

> /** random-access read to any char within the SBuf
> *
> * \throw OutOfBoundsException when access is out of bounds
> */
> _SQUID_INLINE_ const char operator [](size_type pos) const;

This one should not [promise to] check anything or throw. It should be
very fast instead. See std::string.

> /** random-access read to any char within the SBuf.
> *
> * \throw OutOfBoundsException when access is out of bounds
> */
> _SQUID_INLINE_ const char at (size_type pos) const;

This one should indeed check and throw.

> /** direct-access set a byte at a specified operation.
> * \param pos the position to be overwritten
> * \param toset the value to be written
> * \throw OutOfBoundsException when pos is of bounds
> * \note performs a copy-on-write if needed.
> */
> void setAt(size_type pos, char toset);

Perhaps document whether setting the next-to-last character is allowed?
Sometimes string APIs allow for that...

> * \retval 0 argument an this are equal

s/an/and/
It would be better to avoid bare "this" in the above comments. It is
awkward to read.

> * \retval true argument is a prefix of the SBuf
> * \retval false the SBuf is shorter than the argument, or it
doesn't match

Do not document the false values because it is totally redundant and you
often get it slightly wrong.

> /** Consume bytes at the head of the SBuf
> *
> * MemBuf emulation. Consume N chars at SBuf head, or to SBuf's end,
> * whichever is shorter. If more bytes are consumed than available,
> * the SBuf is emptied

.

> * \param howmuch how many bytes to remove.

Append "Could be zero."
s/howmuch/howMuch/ or, better, s/howmuch/n/ for consistency

> /** gets global statistic inforations

spelling

> static const SBufStats& getStats();

Capitalize the first letter of static methods.

> /** Null-terminate the SBuf
> *
> * Make sure that the first byte AFTER the SBuf is a NULL.
> * Doesn't alter the SBuf contents, may trigger a backing store
reloaction
> */
> void terminate();

Remove. Too dangerous if the effect is not persistent. You do not
implement persistent termination, and we probably do not need a
persistent termination anyway.

> /** Import a c-style string into the memory-managed world
> *
> * The char* MUST have been dynamically allocated via new,

c-strings are usually allocated using new[] rather than new.

> * and once imported MUST NOT be freed or messed with by the caller.
> * It will be freed via delete() by the memory management subsystem

and so they should be freed using delete[] rather than delete.

Moreover, your implementation is using xfree() and not delete[]!

And this is inconsistent with xfree()-based API below. Pick one API and
stick with it. Do we really need this error-prone method, BTW?

> * at the appropriate moment
> * (what "the appropriate moment" means is UNDEFINED to the caller).
> *
> * \param len the length of the String. If it's npos or
unspecified, then
> * cstr must be NULL-terminated.
> */
> SBuf& absorb(char * cstr, size_type len=npos);

.

> /** Export the contents of the SBuf to a C-string
> *
> * exports the SBuf by copying the contents to a newly-allocated
c string.
> * Freeing the returned char* is up to the caller, who MUST
xfree() it.
> * The returned string's contents are NOT guarranteed not to
contain \0.
> * The returned string guarranteed to be NULL-terminated.
> */
> char* exportCopy() const;

Add "Avoid. For transition only." ?

> /** Copy SBuf contents into user-supplied C buffer.
> *
> * Export a copy of the SBuf's contents into the user-supplied
> * buffer, up to the user-supplied-length.
> * The exported buffer is guarranteed to have a terminating \0, but
> * may truncate contents if the SBuf is too big.
> * \return 0 if all is OK
> * \return num the number of actually-copied chars, INCLUDING the \0
> * \note uses xstrncpy as low-level function
> */
> size_type copy(char *buf, size_type buflen) const;

Do we really need to zero-terminate the above? This low-level method may
be handy for I/O which does not require zero-termination. We should
minimize the number of APIs that terminate, IMO.

> /** exports a reference to the SBuf internal storage.
> * \warning THIS CALL IS DANGEROUS!

The call itself is not dangerous. Using raw internal storage is.

> * The caller MUST hold a reference to the SBuf while
> * accessing the storage, and MUST NOT attempt any operation on
the SBuf
> * while holding an external reference to the data, or she might
invalidate
> * the memory reference.

The above is vague and too restrictive. Use this, based on
std::string::data() description, instead:

Returns a pointer to buffered content. No terminating null character is
appended (use c_str() for that). The returned value points to an
internal location which contents are guaranteed to remain unchanged only
until the next call to a non-constant member function of the string
object. Such a call may be implicit (e.g., when SBuf is destroyed upon
leaving the current context).

Document whether this method can ever return NULL.

> * This is a very UNSAFE way of accessing the data.
> * The caller better be sure about what she's doing.
> * The returned string is NOT NULL-terminated. For that, use
exportTRef.
> * \see exportTRef
> * \note the memory management system guarrantees that the
exported region
> * of memory will remain valid if the caller keeps holding
> * a valid reference to the SBuf object and does not write or
append to
> * it. For example:
> * \code
> * SBuf foo("some string");
> * const char *bar=foo.exportRawRef();
> * doSomething(bar); //safe
> * foo.append(" other string");
> * doSomething(bar); //unsafe
> * \endcode
> */

I think the above should go into SBuf.dox

> char* exportRawRef() const;

I do not like the Ref() suffixes. You are not really exporting a
reference to internal storage. You are just providing a raw pointer to
it. Call this "raw()" or, following std::string convention, "data()".

> /** exports a null-terminated reference to the SBuf internal storage.
> * \warning THIS CALL IS DANGEROUS!
> *
> * The caller MUST hold a reference to the SBuf while
> * accessing the storage, and MUST NOT attempt any operation on
the SBuf
> * while holding an external reference to the data, or she might
invalidate
> * the memory reference.
> * This is a very UNSAFE way of accessing the data.
> * The caller better be sure about what she's doing.
> * \see exportRawRef
> * \note the memory management system guarrantees that the
exported region
> * of memory will remain valid if the caller keeps holding
> * a valid reference to the SBuf object and does not write or
append to
> * it
> */
> char* exportTRef();

See above. Similar comments apply here.

Let's call this c_str(), following std::string convention.

You may want to say explicitly that our c_str() never returns NULL.

> /** Get the length of the SBuf

Replace with: Returns the number of bytes stored in SBuf.

> * A SBuf is GUARRANTEED to be defined. It can be empty (length()==0)

Remove.

> */
> _SQUID_INLINE_ size_type length() const;

.

> /** Get the length of the SBuf, as a signed integer
> *
> * Compatibility function for printf(3) which requires a signed int
> * \throw SBufTooBigException if the SBuf is too big for a signed
integer

Kind of wrong exception, but not a big deal since it will have filename
and line number.

> /** Request to extend the SBuf's available store space.
> *
> * After the grow request, the SBuf is guarranteed to have at
least minsize
> * bytes of total backing store. If it has already more, do nothing.

Remove "If it has already more, do nothing." to leave space for future
optimizations. The caller should not really care about this anyway.

> * \throw SBufTooBigException if the user tres to allocate too
big a SBuf
> */
> void grow(size_type minsize);

.

> /** slicing method
> *
> * extracts a SBuf containing the portion from <i>from</i> to
<i>to</i>,
> * intersected with the actual SBuf length.

Awkward description, especially because we do not extract but erase.
Consider: Removes SBuf prefix and suffix, leaving a sequence of bytes
from <i>from</i> to <i>to</i>.

However, I think we should follow std::string lead here and change the
parameters to be "pos" and "n". I do not know why they picked those, but
let's be consistent with them and with our other methods. They also do
not clash with "to" and "from" words in the description :-). I would
also rename this to leave(), which will help you to check that all
callers use the right params after the parameters change :-).

> * If to is less or equal to from, an empty SBuf is returned.

Missing "than": If to is less than or equal to from, an empty SBuf is
returned.

> * \param from start substringing from this byte
> * \param to end of substring. If longer than the SBuf
> * it will be limited to its size.
> * SBuf::npos means "to end of SBuf"
> * \throw OutOfBoundsException when from exceeds the size of the SBuf
> * \todo: possible optimization: if the string is at the end of the
> * MemBlob we can decrease the MemBlob tail-pointer so that a
subsequent
> * append will reuse the freed space.
> */
> SBuf& chop(size_type from, size_type to=npos);

The \todo comment does not belong here. It belongs to .cc.

> /** Remove characters in the toremove set at the beginning, end
or both
> *
> * \param toremove characters to be removed. Stops chomping at
the first
> * found char not in the set
> * \param atBeginning if true (default), strips at the beginning
of the SBuf
> * \param atEnd if true (default), strips at the end of the SBuf
> */
> SBuf& chomp(const SBuf &toremove, bool atBeginning=true, bool
atEnd=true);

Consider renaming to trim() to avoid visual clashes with chop() and a
clash with Perl chomp that does not trim the prefix.

> MemBlob::Pointer store_; /// handle to the MemBlob holding the
memory block alive

Consider: ///< memory block, possibly shared with other SBufs

> size_type off_; /// offset of this string from store_->mem

Consider: ///< our content start offset from the beginning of the
possibly shared store_

> size_type len_; /// length of the SBuf

Consider: ///< number of our content bytes in the possibly shared store_

> u_int16_t nreallocs; /// number of reallocs this SBuf has been
subject to

Since nreallocs is copied on assignment, it's true meaning is rather
difficult to define. However, it is _not_ the number of reAlloc() calls
that _this_ SBuf has seen.

> static SBufStats stats; /// class-wide statistics
> static const size_type maxSBufSize = 0xffffff; // tuneable

Missing description. What is it? How is it different from maxSize
defined earlier?

> static const size_type malloc_overhead = 8; //tuneable.

Missing description. What is it?

> static MemBlob::Pointer storePrototype;
> _SQUID_INLINE_ static MemBlob::Pointer getStorePrototype();
//only used by anonymous constructor

Static thing names should start with a capital letter.

> _SQUID_INLINE_ char * buf() const;
> _SQUID_INLINE_ char * bufEnd() const;

Why const if they allow SBuf modification?

> _SQUID_INLINE_ size_type estimateCapacity(size_type desired);

Should be const if not static.

I noticed that you print "this" and similar raw pointers in debugging
messages. In my experience, those quickly become unusable in non-trivial
logs because when old objects are deleted, the new ones reuse their
pointers. Consider adding unique SBuf identifiers instead. We can always
remove them later if the overhead of extra 32 or 64 bits becomes too
great after the transition is over.

For SBuf.cci:

> #include <limits.h>

Should we use <climits> if available?

> SBuf& SBuf::operator = (const SBuf & S)
> {
> //stats set in called method
> return assign(S);
> }
>

Consider removing the comment. It is obvious that this method does
nothing but forwarding the call.

> int SBuf::plength() const

Do we need to place the return type on its own line in .cci?

> /**
> * returns the pointer to the first invalid char after this SBuf
> */

Consider: returns the pointer to the first char after this SBuf end

> /**
> * Try to guesstimate how big a MemBlob to allocate.
> * The result is guarranteed to be bigger than the desired
> * size.

s/to be bigger than the/to be at least the/

> * \todo improve heuristics

The \todo comment belongs to the implementation below, not description.

> SBuf::size_type SBuf::estimateCapacity (SBuf::size_type desired)
> {
> size_type given;
>
> given=(desired*12)/10;

Consider: const size_type given = (desired*12)/10;

IIRC, 100% is a typical growth factor. If your 20% is not based on
something specific, let's use 100% instead:

     const size_type given = desired*2;

We are counting on the MemBlock to round that up, right? Perhaps you
should add a comment about that.

> /**
> * Checks common to all string comparison functions, dealing with
undefined strings
> * \retval 0 check returns equal
> * \retval -1 check returns -1
> * \retval 1 check returns 1
> * \retval 2 check is not conclusive
> */
> int SBuf::commonCompareChecksPre (const SBuf &S) const
> {
> return 2;
> }

Remove until needed? The return value descriptions are confusing, IMO.

> MemBlob::Pointer SBuf::getStorePrototype()
> {
> if (storePrototype==0)
> storePrototype = new MemBlob((char*)"",0);
> return storePrototype;
> }

Please document and re-implement without const cast of "".

>
> bool
> SBuf::isEmpty() const
> {
> return (len_==0);
> }

Consider: return !len_;

For SBuf.cc:

> SBuf::SBuf(const String &S)
> : off_(0),nreallocs(0)
> {
> debugs(SBUF_DEBUGSECTION,DBG_DATA,
> MYNAME << "this=" <<(void *)this <<
> " copyfrom="<<(void *)&S);
> store_=new MemBlob(S.size());
> store_->append(S.rawBuf(),S.size());
> len_=S.size();
> ++stats.alloc;
> ++stats.sset;
> ++stats.live;
> }

Use assign() or, better, some common denominator to implement the guts
of this constructor. The above implementation is already slightly
inconsistent with the other similar ones. Its only going to get worse.

> SBuf::SBuf(const char *S, size_type Ssize, size_type pos, size_type n)
> : off_(0), nreallocs(0) //len_ and store_ defined inside the call
> {
> //very similar to SBuf::append(const char*...). Can be refactored?

Same as above.

> Must(pos==npos || pos >= 0);
> Must(n==npos || n >= 0);
>
> if (S==NULL) //to prevent using strlen later. Appending will also
be a nop
> Ssize=0;
> if (Ssize==npos)
> Ssize=strlen(S);
> if (n==npos)
> n=Ssize-pos;
> if (n > Ssize-pos)
> n=Ssize-pos;

What if pos > Ssize?

>
> const char *actual_start=S+pos;

If S is NULL, actual_start will point to nowhere!

I suspect you are also missing some other checks, but I did not verify
because Ssize is in the way.

> debugs(SBUF_DEBUGSECTION,DBG_DATA,
> MYNAME << "this=" << (void *)this << ", S=" << (void*) S <<
> ", Ssize=" << Ssize << ", pos=" << pos << ", n=" << n);
> store_=new MemBlob(estimateCapacity(n));
> store_->append(actual_start,n);
> len_=n;
> ++stats.alloc;
> ++stats.sset;
> ++stats.live;
> }

.

> SBuf& SBuf::assign(const SBuf &S)
> {
> debugs(SBUF_DEBUGSECTION,7,
> HERE << ", this@"<<(void *)this
> << ", other=" << ", other@"<< (void *)&S );
> ++stats.qset;
> if (&S==this) //assignment to self. Noop.
> return *this;

"Noop" after a member change looks wrong. Remove the "Noop" part of the
comment?

> void SBuf::grow(size_type minsize)
> {
> debugs(SBUF_DEBUGSECTION, 7,
> MYNAME << "this="<<(void *)this <<
> ",target=" << minsize);
> if (minsize > maxSize)
> throw SBufTooBigException(__FILE__,__LINE__);
> if (minsize < length()) { //nop. We're already big enough.

The length() is irrelevant here. You need to check available space size,
not allocated content size. More on the grow() checks below.

> debugs(SBUF_DEBUGSECTION, 7, HERE << "not growing");
> }
> reAlloc(estimateCapacity(minsize));
> }

Optimization: We should probably memmove() before we reAlloc() if it
helps to satisfy minsize.

> void SBuf::clear()
> {
> #if 0
> //enabling this code path, the store will be freed and reinitialized
> store_=getStorePrototype(); //uncomment to actually free storage
upon clear()
> #else
> //enabling this code path, we try to release the store without
deallocating it.
> // will be lazily reallocated if needed.
> if (store_->RefCountCount() == 1)
> store_->bufUsed=0;

Please remove this bufUsed optimization. It appears to be broken (you
increment bufUsed inconsistently), it is complex, and we can always add
it later.

> SBuf& SBuf::append(const SBuf &S)
> {
> debugs(SBUF_DEBUGSECTION,7,
> MYNAME << "this=" << (void *)this <<
> "appending=" <<(void *)&S<<")");
> if (!store_->canAppend(bufEnd(),S.length())) {
> grow(length()+S.length());
> }

Please remove all such canAppend() guards. Just call grow(). It is much
safer than risking to forget a guard. Your grow() should not reAlloc if
growth is not needed. You may rename grow() to assureSpace() or similar
if you prefer but do not split the growth need check and the realloc
call in the "user" code.

> if (S.length()==0) { //appending a null sbuf.
> ++stats.qset;
> return *this;
> }

Remove (optimizing unusual cases is harmful to the common ones) or at
least move it above the grow() call.

> debugs(SBUF_DEBUGSECTION,8,
> HERE << "appending");
> store_->append(S.buf(),S.length());
> len_+=S.length();

This is missing cow(). And there are many similar bugs elsewhere. The
right fix, IMO, is to hide store_ and provide two inlined protected
methods similar to these:

MemBlob &store() { cow(); return *store_; } /// memory access for writing
const MemBlob &store() const { return *store_; } /// read-only memory access

This way, you do not need to remember or think about when to call cow(),
which could be tricky at times. The beginning of cow() can be inlined to
make initial "do we need to copy?" check very cheap.

> SBuf& SBuf::append(const std::string &s, SBuf::size_type pos,
SBuf::size_type n)
> {

Use append(char*) or similar to implement this and other std::string
methods.

> std::ostream& SBuf::dump(std::ostream &os) const
> {
> os << "SBuf@" << (void*) this
> << ": memhandle:" << (void *)store_.getRaw()
> << "{sz:" << store_->bufSize
> << ",used:" << store_->bufUsed
> << ",refs: " << store_->RefCountCount() << "}"

MemBlock should dump its own details instead.

> int SBuf::compare(const SBuf &S, bool case_sensitive) const
> {

I think we should add an "n" parameter to limit the comparison scope to
the first n characters. This will allow for much cheaper implementation
of startsWith() and may be useful elsewhere as well. You are computing a
similar parametr below anyway.

> SBuf SBuf::consume(SBuf::size_type howmuch)

s/howmuch/n/ for consistency with other methods.

> {

Start this method with:
       const SBuf rv(substr(0,howmuch));

Do we really need to return consumed content?

> if (howmuch==npos)
> howmuch=length();
> const size_type actual=min(howmuch,length());
> SBuf rv(substr(0,actual));
> off_+=actual;
> len_-=actual;
> if (length()==0) {//we can free the store_
> clear();
> return rv;
> }
> ++stats.qset;
> return rv;
> }

Please implement the code above using chop().

> SBuf& SBuf::chop(SBuf::size_type from, SBuf::size_type to)
> {
> checkAccessBounds(from); //throws if out-of-bounds
> if (to==npos)
> to=length();

Missing "else"

> if (to > length())
> to=length();
> ++stats.qset;
> if (to <= from) {
> clear();
> return *this; //stats handled by callee

Remove comment because there are some stats increments above it.

> SBuf& SBuf::chomp(const SBuf &toremove, bool atBeginning, bool atEnd)
> {
> ++stats.qset;
> char *p;

Please declare "const" if possible and move inside each if-statement
below, combining definition and initialization.

> if (atEnd) {
> p=bufEnd()-1;
> while (!isEmpty() &&
NULL!=memchr(toremove.buf(),*p,toremove.length())) {

Yoda comparison style please avoid. At the very least calling a function
when. No safer it is, but read difficult to. Please.

> //current end-of-buf is in the searched set
> chop(0,length()-1); //remove last char and iterate

Can the above expensive call be replaced with something like "--len_"?
Even if the final replacement would have to be a litle more complex, we
should probably still do it because calling chop() for every character
is just too much overhead.

> --p;
> }
> }
> if (atBeginning) {
> p=buf();
> while (!isEmpty() &&
NULL!=memchr(toremove.buf(),*p,toremove.length())) {
> chop(1);

Same here. ++off_?

> SBuf::size_type SBuf::find(char c, SBuf::size_type startpos) const
> {
> if (startpos==npos)
> startpos=length();
> startpos=min(startpos,length());

Missing "else" before the last line.

> /**
> * checks whether the requested 'pos' is within the bounds of the SBuf
> * \throw OutOfBoundsException if access is out of bounds
> * \throw TextException if the SBuf is null

I do not see TextException being thrown below.

> */
> void SBuf::checkAccessBounds(SBuf::size_type pos) const
> {
> if (pos < 0)
> throw OutOfBoundsException(*this,pos,__FILE__,__LINE__);
> if ((size_t)pos > (size_t)length()) //hack. gcc complains
otherwise...

I do not understand why this is needed because pos and length() are the
same type. What did GCC complain about?

Do we allow pos == length()?

> throw OutOfBoundsException(*this,pos,__FILE__,__LINE__);
> }

.

> String SBuf::toString() const
> {
> String rv;
> rv.limitInit(exportRawRef(),length());

Can we just call buf() here instead of the scarry exportRawRef()?

Thank you,

Alex.

> src/SBufList.*: a class implementing a list of SBufs
> src/SBufTokenizer.*: a SBuf-based tokenizer, meant to implement one-pass parsers
> src/SBufExtras.*: stuff meant to integrate SBuf with Squid
> (cachemanager) and a place to put various helping stuff
> src/SBufUtil.*: SBuf-based utilities (so far conversion functions to
> and from SBufList, and a BaseName)
> Everything else is caller adaption.
>
> Thanks to everyone who'll take the time to check this out.
>
Received on Fri Sep 17 2010 - 11:26:17 MDT

This archive was generated by hypermail 2.2.0 : Tue Sep 21 2010 - 12:00:06 MDT