adjust the contract on string->url so that it actually catches all of
the errors that would be signalled by the body. Also, remove
url-regexp from the exports (it was only recently added)

I believe this eliminates two of Eli's concerns:

  - the contract is no longer so painful to read

  - the performance is more reasonable
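To make the readability point concrete, the regexp-based contract quoted with the timings in this message can be restated as an ordinary predicate. This is only a sketch: `scheme-ok?` is a hypothetical helper written for illustration, not something net/url exports, and it covers only the regexp checks.

```racket
#lang racket/base
;; Hypothetical helper for illustration only; it is not part of net/url.
;; It restates the two regexps of the contract as an ordinary predicate.
(define (scheme-ok? str)
  (or
   ;; No scheme: the string does not start with a (possibly empty) run of
   ;; characters other than : / ? # followed by a colon.
   (not (regexp-match? #rx"^([^:/?#]*):" str))
   ;; A scheme is present, so it must start with a letter and continue
   ;; with letters, digits, "+", "." or "-".
   (regexp-match? #rx"^[a-zA-Z][a-zA-Z0-9+.-]*:" str)))

(scheme-ok? "http://www.racket-lang.org") ; well-formed scheme
(scheme-ok? "/just/a/path")               ; no scheme at all
(scheme-ok? "1http://example.com")        ; scheme starts with a digit
```

The first two calls should be accepted and the third rejected, since "1http" does not begin with a letter.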

Specifically, for the performance, here are the times I see to call
string->url on "http://www.racket-lang.org":

no contract: any/c
cpu time: 564 real time: 566 gc time: 3

weak contract: (-> (or/c string? bytes?) url?)
cpu time: 590 real time: 590 gc time: 3

strong, regexp-based contract:
(-> (or/c (not/c #rx"^([^:/?#]*):") #rx"^[a-zA-Z][a-zA-Z0-9+.-]*:") url?)
cpu time: 632 real time: 633 gc time: 5

This appears to be about a 7% slowdown for the regexp-based contract
over the weaker contract (632 vs. 590 cpu time), and about 12% over
no contract at all (632 vs. 564).
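A measurement of this shape could be reproduced with a loop like the one below. It is a minimal sketch under stated assumptions: the iteration count is invented here, since the message does not say how many calls were timed, and comparing the three variants would presumably mean rebuilding net/url with each contract in turn.

```racket
#lang racket/base
(require net/url)

;; Assumption: the iteration count is made up for this sketch; the
;; commit message does not say how many calls were timed.
(define iterations 100000)

;; `time` prints a line of the form
;; "cpu time: ... real time: ... gc time: ..." (all in milliseconds).
(time
 (for ([i (in-range iterations)])
   (string->url "http://www.racket-lang.org")))
```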

related to PR 12652

original commit: 86572cc8c3
Robby Findler 2012-03-29 17:22:49 -05:00
parent 1b243ce46b
commit 51cf8696b3


@@ -1,6 +1,5 @@
 #lang scribble/doc
 @(require "common.rkt" scribble/bnf
-(only-in net/url url-regexp)
 (for-label net/url net/url-unit net/url-sig
 net/head net/uri-codec net/tcp-sig
 (only-in net/url-connect current-https-protocol)
@@ -96,7 +95,9 @@ An HTTP connection is created as a @deftech{pure port} or a
 have been removed, so that what remains is purely the first content
 fragment. An impure port is one that still has its MIME headers.
-@defproc[(string->url [str (and/c (or/c string? bytes?) url-regexp)]) url?]{
+@defproc[(string->url [str (or/c (not/c #rx"^([^:/?#]*):")
+#rx"^[a-zA-Z][a-zA-Z0-9+.-]*:")])
+url?]{
 Parses the URL specified by @racket[str] into a @racket[url]
 struct. The @racket[string->url] procedure uses
@@ -104,6 +105,10 @@ struct. The @racket[string->url] procedure uses
 sensitive to the @racket[current-alist-separator-mode] parameter for
 determining the association separator.
+The contract on @racket[str] insists that, if the url has a scheme,
+then the scheme begins with a letter and consists only of letters,
+numbers, @litchar{+}, @litchar{-}, and @litchar{.} characters.
 If @racket[str] starts with @racket["file:"], then the path is always
 parsed as an absolute path, and the parsing details depend on
 @racket[file-url-path-convention-type]:
@@ -123,17 +128,6 @@ parsed as an absolute path, and the parsing details depend on
 ]}
-@defthing[url-regexp regexp?]{
-This is a regular expression based on the one in
-Appendix B of RFC 3986 for recognizing urls.
-This is the precise regexp:
-@centered{@tt{@(object-name url-regexp)}}
-}
 @defproc[(combine-url/relative [base url?] [relative string?]) url?]{
 Given a base URL and a relative path, combines the two and returns a