whalesong/NOTES
2011-06-07 16:48:56 -04:00

739 lines
20 KiB
Plaintext

Some possible optimizations with application:
If any of the operands are constant (either by being variable
lookups or literal constants), and if all of them are side-effect
free, then juggle-operands might not be necessary. I think this
is similar to the "reorder" optimization described in casey's
paper.
In a self-application, it's not necessary to compute the operator,
since the value is in the top control frame. A parameterization
can maintain the current lam in the top of the control frame.
Given that, then there's no need to juggle operands either, since
we can grab the operator afterwards and put it in place.
For a kernel primitive call, if all of the operands are all
constant, stack references, or kernel primitive calls, then
there's no need to push for fresh stack space.
----------------------------------------------------------------------
Multiple values
There's interplay between compile-proc-appl and the linkage compiling
functions compile-linkage and compile-application-linkage. When we
deal with multiple values, we'll have to do something here to make the
values efficient. There's a paper by J. Michael Ashley and R. Kent
Dybvig called "An Efficient Implementation of Multiple Return Values
in Scheme" that I'll need to read.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.39.1668&rep=rep1&type=pdf
Basic idea: each return address is actually a pair, where the
secondary address lies at a fixed offset of the first and handles
multiple value return. Multiple values are returned back by keeping
them on the stack, and assigning argcount to the number of the
returned values.
In the context of my compiler: the compiler implicitly defines a
singleton, statement context by using next-linkage. But some uses of
next-linkage ignore the number of values that come back, and others
should raise an error. Here are the contexts that care:
app
let1
install-value
toplevel-set (define-values, assign)
For the contexts that don't care, we need to set up a return address
that just pops those values off.
Before introducing the multiple-value jumps
(172b1d9e5de823b53a6705fc87babfdd61152924), test-conform-browser
reports the following times:
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5248 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5478 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5501 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5853 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5532 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5498 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5351 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5464 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5545 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5405 milliseconds)
After introducing the mutiple value jumps targets
(cc1c156df79bab09ca37164e75ae0afe0ac1b0d0), test-conform-browser is
reporting the following times:
running test... ok (5281 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5554 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5588 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5509 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5428 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5387 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5539 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5355 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5551 milliseconds)
fermi ~/work/whalesong $ racket test-conform-browser.rkt
running test... ok (5331 milliseconds)
At a rough glance, I see no appreciable extra cost for this program,
since it doesn't use multiple-value-return. Thankfully, it looks like
the JIT in JavaScript isn't significantly hurt when we set the
attribute to the procedure.
What's left to do:
forms for using the values coming from multiple value returns
(with-values, define-values, let-values)
runtime error traps for contexts that must not receive multiple values.
fixing apply definition so it doesn't return multiple values when
given a single argument.
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
----------------------------------------------------------------------
Open coding:
I want to be able to write the definitions of kernel primitives once,
and reuse those definitions for both the open-coding as well as the
real runtime. I also need to be able to encode the type checks. I
want to be able to say:
(make-kernel-primitive '+
(arity 0 #t)
(lambda (args)
(values (mapi (lambda (arg i)
(test arg i number?))
arg)
(string-join args "+"))))
and have it magically generate the definitions for the open-coding
primitive as well as:
PRIMITIVES["+"] = function(MACHINE, arity) {
var result = 0;
for (var i = 0 ; i < arity; i++) {
test(isNumber(MACHINE.env[MACHINE.env.length - 1 - i]),
i,
"number");
result += MACHINE.env[MACHINE.env.length - 1 - i];
}
return result;
};
Is this completely unrealistic? I have to see how Rabbit and Orbit do this.
----------------------------------------------------------------------
Runtime values and types are in in the plt.runtime namespace. I need
to move types from WeScheme into here.
----------------------------------------------------------------------
Frames and environments.
A CallFrame consists of:
A return address back to the caller.
A procedure (the callee).
A stack.
A set of continuation marks.
A PromptFrame consists of:
A return address back to the caller.
A tag.
A set of continuation marks.
On exit from a CallFrame,
MACHINE.env = frame.env
On a regular, generic function call:
The operator and operands are computed and placed in MACHINE.env's
scratch space.
A new call frame is constructed. The frame remembers the environment.
The machine jumps into the procedure entry.
On a tail call,
The operator and operands are computed and placed in MACHINE.env's
scratch space.
The existing call frame is reused.
The frame's environment consumes those elements from MACHINE.env
MACHINE.env = the new stack segment
Optimizations with IL
The sequence PushEnvironment ... AssignImmediateStatement (EnvLexicalAddress ...)
where we're assigning directly to a spot we just allocated, can be reduced to
a single instruction.
We can do some constant folding in operands. e.g.
MACHINE.env[MACHINE.env.length - 1 - 3] = MACHINE.env[MACHINE.env.length - 1 - 7];
=>
MACHINE.env[MACHINE.env.length - 4] = MACHINE.env[MACHINE.env.length - 8];
On tail calls, when we're reusing all of the arguments on the stack,
there's no need to splice, since we won't be popping anything off:
MACHINE.env.splice(MACHINE.env.length - (MACHINE.argcount + ((10) - MACHINE.argcount)), ((10) - MACHINE.argcount));
is a no-op.
In the case where a closure has a prefix, but all the uses of the prefix are to open-coded primitives, then we don't need to close over it after all. e.g.
(test '(begin (letrec ([f (lambda (x) (* x x))]
[g (lambda (x) (* x x x))])
(- (g (f (+ (g 3) (f 3)))) 1)))
2176782335
#:debug? #t)
since (* -) are both open-coded, there's no need to capture the
prefix, and we can reduce some allocation.
I can eliminate the first instruction in the pair:
#(struct:AssignPrimOpStatement val #(struct:GetCompiledProcedureEntry))
#(struct:GotoStatement #(struct:Label lamEntry259))
since the val isn't even being used here... This is the case when we
statically know the lambda target.
- this is done now.
I can coalese
(PushEnvironment 1 #f)
(AssignPrimOpStatement (EnvLexicalReference 0 #f) (MakeCompiledProcedure 'lamEntry265 1 '(2 1) 'diff)
into a single statement.
If lambdas don't escape, then we can make their closures empty by
simply explicitly passing in the free arguments.
There's no good reason why the IL has both AssignImmediateStatement
and AssignPrimOpStatement. The distinction is artificial because I'm
allowing the RHS of assignments to use arbitrary expressions, since my
runtime (JavaScript) supports it. I should consolidate these
structures; it may allow me to remove a few more instructions (like
setting ControlLabel to 'val).
flush-output must immediately yield control to the browser, because
the browser needs control back to display changes to the dom.
Basically, we're simulating an IO interrupt here...
April 17, 2011
The dynamic recomputation for gas is only controlling one parameter:
how many times to run the trampoline before bouncing off to the
browser. But we really have two parameters that need dynamic
computation
* FN: the number of function calls before invoking the trampoline.
FN is necessarily bounded above by the browser. The larger it
is, the more efficient the trampoline can be.
* TI: the number of trampoline invokations before yielding to the browser
Both of these should be under some dynamic controller. We want to
optimize the efficiency of the runtime. I don't know what the
function is, but we want to optimize the parameters FN and TI such
that it maximizes FN and minimizes TI, and yet gives us the browser
reactivity we want.
April 24, 2011
The variables for linkage and target are doing double duty, which is
showing up in the defintion for compilation, since there are cases
that shouldn't exist in there.
They really should be part of the same datatype which describes,
essentially, what the code's continuation should be doing next.
Target's describing where the value needs to be installed at the end
of this, and linkage describes how to jump into the continuation.
Return --- write value to val, pop off and jump according to
dynamic value on control context. Return context may be in tail
position or not.
Next --- write value to a particular target and continue on.
Label --- write value to a particular target and jump
unconditionally to labeled location.
The continuation may or may not be expecting multiple values.
Ignore: doesn't care how many values come back. Throw away values
if multiple values are passed in.
Any: receives multiple values, and ensures those values are on the
stack. If a single value is received, pushes it on the stack and
sets up argcount to 1.
N: must receive exactly N values. If there's a mismatch, raises a
runtime error.
Return will allow Any number of values to come back. It doesn't need
a separate multiple-value context.
Next expects either exactly 1 value to come back, or ignores. So it
needs an multiple-value context.
Label, too, expects exactly 1 value to come back, or ignores. So it
needs a mulitple-value context.
When we use apply-values, it'll compile the producer expression in an
Any context.
I'm going to simplify values a bit.
----------------------------------------------------------------------
April 28
Multiple values are handled in the following way now.
In a context that expects multiple values to be returned,
if n = 0, don't leave anything on the stack before jumping out
if n = 1, put the single value in the 'val register
if n > 1, put the first value in the 'val register, and leave the
rest (the n-1 values) on the value stack.
The context is then responsible for dealing with those multiple return
values.
The contexts are now of the following types:
'tail : keeps the values on the stack. Used specifically for tail return.
'drop-multiple : drops any extra values on the stack
'keep-multiple : keeps any number of values on the stack.
Natural : expects exactly n values. Errors out if this can't be the case.
There appears to be a bug in compile-splice regarding multiple value
contexts. I haven't yet fixed the bug. I need a test case. I need
to somehow create a splicing expression in the context of something
that expects multiple values back. I'm not exactly sure how to create
such a context.
Ok, I think I've been able to do this successfully. I lifted out the
code for emit-values-context-check-on-procedure-return so it's used
for both the returns from procedure call, as well as the calls from
the prompt splicing.
----------------------------------------------------------------------
May 13, 2011
begin0 is still broken; I'm a bit unsatisfied with the way that it's
coded, and I know it's not correct yet.
The other thing that's I'm sure needs to be looked at again are the
implementation of prompts.
modules are almost in place. Needs a mechanism for loading modules on
the fly on the network, as well as some kind of predicable namespacing
mechanism. I think the compiler will need to include something like a
(current-module-name-canonizer)
which takes module names (symbol, path-string) and systematically
translates them to predictable identifiers. Anything refering to a
collection should be translated to
collects/...
Anything outside that should be given a name relative to some root.
One should be able to say:
root the translation at "/home/dyoo/work/js-sicp-5.5/examples"
where all translated paths are either from collections, or reachable
from the root. That way, we get predictable paths.
js-sicp-5-5 is an uninspired name for the project. I'm renaming it to
"Whalesong". Whale, because it's related to the Moby-Scheme project,
and song because, well, some songs can be called a "Racket". :)
I want to use Google Closure's compiler to minify. Closure doesn't
like some of the code, so I'll need to rename some of the identifiers
in types.js to avoid colliding with Java keywords (char, float).
----------------------------------------------------------------------
I need to be careful about targetting the right target in recursive calls to compile.
The bug I was chasing down involved using (compile subexpr target cenv), where the
target was that of the parent expression. But this is an invalid thing to do.
The other bug I nailed down yesterday was the shared-gensym bug with
lambda entry points. I have a single module now that distributes
gensyms for the lambda entry points. At some point, I'd like to make
the names descriptive so it's easier to figure out the functions by
low-level inspection.
Optimizations are off at the moment until I integrate this into Moby.
Then I can start turning optimizations back on.
----------------------------------------------------------------------
May 20, 2011
I'm running my bytecode parser over the entire racket collects tree,
just to make sure the parser itself is robust.
Parsing takes milliseconds, except on Typed Racket code, which is expected.
----------------------------------------------------------------------
May 23, 2011
Let me list out, roughly, what's left for me to do do:
Get module invokation working.
Get enough of the racket/base helper functions working to get
basic programs in place.
Get the runtime of Moby in the system.
Integrate the raw JavaScript-specific extensions in.
Isolate performance issues.
I've isolated exactly what primitives I need to get racket/base up and
running. It looks like I need 231 of them. That's not that much,
actually. experiments/primitives-for-racket-base describes which ones
we need.
----------------------------------------------------------------------
May 25, 2011
About to make modules work. Need to make sure exports can rename names.
----------------------------------------------------------------------
What's currently preventing racket/base?
Nan, INF Numbers, Regular expressions, keywords, byte strings,
character literals
Missing #%paramz module
----------------------------------------------------------------------
What needs to be done next?
benchmarks
being able to write modules in javascript
being able to bundle external resources (like images)
----------------------------------------------------------------------
June 2, 2011
We need a mechanism for attaching external resources to a program and
to be able to refer to them. External resources refer to things like:
Images
HTML files
Sounds
Each of these should be findable by path, and they should also be
packaged when a program is built. The user should be able to say
something like this:
(define my-html-file (local-resource-path ...))
and the packager should automatically be able to statically walk
though all local-resource-path calls.
These files must be represented as separate files.
There may be additional resources that need to be included, though
they aren't directly used in the program. (Say, for example, an image
is refered to in an html resource. Maybe we should have a toplevel
element (include-resources ...) that take in a bunch of
build-resource-path.
(include-resources (build-resource-path ...)
...)
The Whalesong make system needs to know how to deal with resources.
I also want to be able to refer to external web resources, and somehow
capture or cache them during building.
Imaginary: I would like to be able to write something like this:
(define my-file (remote-resource-path ...))
and be able to treat my-file as if it were a local resource. During
packaging, the file should be directly downloaded.
Other things to consider:
We may need to consider workarounds for same-origin policy
restrictions.
Can resources be input-port sources?
----------------------------------------------------------------------
I've at least imported the libraries from Moby so that they're loaded
in the same runtime. However, there are some ugly things happening
here: for some reason, there's a circular dependency between types and
helpers. That dependency needs to be cut! I've added a workaround
for now, using the link library to delay initialization until the
modules are present, but this is an unsatisfactory solution. We
really shouldn't get into this problem in the first place... I must
do a lot of code cleanup once things stabilize...
----------------------------------------------------------------------
June 3, 2011
How do I test what I'm integrating? I should at least absorb the
js-vm test suite, at the very least.
11am: I'm going to spend the next hour trying to disentangle the
circularity between helpers and types.
The parameters I'm using to control bounce are too high for Firefox,
leading it to raise the dialog about an out of control jva process. Not good.
----------------------------------------------------------------------
Working out the type mappings between values in Racket and values in JavaScript
Racket JavaScript Switched over?
number jsnums.SchemeNumber yes
immutable strings JavaScript string
mutable strings types.Str
vector
regular-expressions
path
bytes
box
placeholder
character
symbol
keyword
pair yes
empty yes
eq-hashtable
equal-hashtable
struct-type
struct
path
continuation-mark-set
primitive-procedure
closure
case-lambda
----------------------------------------------------------------------
I should add the functions:
get-viewport-width
get-viewport-height
Other notes from Shriram: too slow, inconsistent positioning from jsworld.