Some possible optimizations with application:

    If any of the operands are constant (either by being variable
    lookups or literal constants), and if all of them are side-effect
    free, then juggle-operands might not be necessary. I think this
    is similar to the "reorder" optimization described in Casey's
    paper.

    In a self-application, it's not necessary to compute the operator,
    since the value is in the top control frame. A parameterization
    can maintain the current lam in the top of the control frame.
    Given that, there's no need to juggle operands either, since
    we can grab the operator afterwards and put it in place.

    For a kernel primitive call, if all of the operands are
    constants, stack references, or kernel primitive calls, then
    there's no need to push for fresh stack space.

----------------------------------------------------------------------

Multiple values

There's interplay between compile-proc-appl and the linkage compiling
functions compile-linkage and compile-application-linkage. When we
deal with multiple values, we'll have to do something here to make the
values efficient. There's a paper by J. Michael Ashley and R. Kent
Dybvig called "An Efficient Implementation of Multiple Return Values
in Scheme" that I'll need to read.

    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.39.1668&rep=rep1&type=pdf

Basic idea: each return address is actually a pair, where the
secondary address lies at a fixed offset from the first and handles
multiple value return. Multiple values are returned by keeping
them on the stack, and assigning argcount to the number of
returned values.

In the context of my compiler: the compiler implicitly defines a
singleton, statement context by using next-linkage. But some uses of
next-linkage ignore the number of values that come back, and others
should raise an error. Here are the contexts that care:

    app
    let1
    install-value
    toplevel-set (define-values, assign)

For the contexts that don't care, we need to set up a return address
that just pops those values off.

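A minimal sketch of the paired return address idea, under assumptions of
my own: the "fixed offset" is modeled as a property on the return-address
function, and the names makeReturnAddress, returnValues, ignoreValues, and
the MACHINE fields are hypothetical, not taken from the compiler.

```javascript
// Hypothetical machine state: a value register, a value stack, and argcount.
var MACHINE = { val: undefined, env: [], argcount: 0 };

// A return address is a function. Its multiple-value entry point lives at a
// fixed, known place (here, a property on the function).
function makeReturnAddress(onSingle, onMultiple) {
    onSingle.multipleValueReturn = onMultiple;
    return onSingle;
}

// Callee side: hand `values` back to the return address `k`.
function returnValues(MACHINE, k, values) {
    if (values.length === 1) {
        MACHINE.val = values[0];
        return k(MACHINE);
    }
    // Multiple values stay on the stack; argcount records how many there are.
    for (var i = 0; i < values.length; i++) {
        MACHINE.env.push(values[i]);
    }
    MACHINE.argcount = values.length;
    return k.multipleValueReturn(MACHINE);
}

// A context that doesn't care how many values come back: its multiple-value
// entry point just pops the values off and continues.
var ignoreValues = makeReturnAddress(
    function (MACHINE) { return "continued"; },
    function (MACHINE) {
        MACHINE.env.length = MACHINE.env.length - MACHINE.argcount;
        return "continued";
    });
```

Calling returnValues(MACHINE, ignoreValues, [1, 2, 3]) goes through the
secondary entry point and leaves the stack as it found it. A context that
cares would install a secondary entry point that raises an error instead.
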
Before introducing the multiple-value jumps
(172b1d9e5de823b53a6705fc87babfdd61152924), test-conform-browser
reports the following times:

    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5248 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5478 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5501 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5853 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5532 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5498 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5351 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5464 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5545 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5405 milliseconds)

After introducing the multiple-value jump targets
(cc1c156df79bab09ca37164e75ae0afe0ac1b0d0), test-conform-browser is
reporting the following times:

    running test... ok (5281 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5554 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5588 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5509 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5428 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5387 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5539 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5355 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5551 milliseconds)
    fermi ~/work/js-sicp-5-5 $ racket test-conform-browser.rkt
    running test... ok (5331 milliseconds)

At a rough glance, I see no appreciable extra cost for this program,
since it doesn't use multiple-value-return: the mean over the ten runs
is about 5488 milliseconds before and about 5452 milliseconds after.
Thankfully, it looks like the JIT in JavaScript isn't significantly
hurt when we set the attribute on the procedure.

What's left to do:

    forms for using the values coming from multiple value returns
    (with-values, define-values, let-values)

    runtime error traps for contexts that must not receive multiple values.

    fixing apply definition so it doesn't return multiple values when
    given a single argument.

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

----------------------------------------------------------------------

Open coding:

I want to be able to write the definitions of kernel primitives once,
and reuse those definitions for both the open-coding as well as the
real runtime. I also need to be able to encode the type checks. I
want to be able to say:

    (make-kernel-primitive '+
       (arity 0 #t)

       (lambda (args)
          (values (mapi (lambda (arg i)
                           (test arg i number?))
                        args)
                  (string-join args "+"))))

and have it magically generate the definitions for the open-coding
primitive as well as:

    PRIMITIVES["+"] = function(MACHINE, arity) {
        var result = 0;
        for (var i = 0; i < arity; i++) {
            test(isNumber(MACHINE.env[MACHINE.env.length - 1 - i]),
                 i,
                 "number");
            result += MACHINE.env[MACHINE.env.length - 1 - i];
        }
        return result;
    };

Is this completely unrealistic? I have to see how Rabbit and Orbit do this.

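One way the "write it once" idea might look, as a rough sketch only: a
single emitter function produces the open-coded expression, and the runtime
version is derived from that same emitter with the Function constructor.
makeKernelPrimitive, its parameters, and the error message format are all
hypothetical here, and this is not how Rabbit or Orbit do it.

```javascript
// Hypothetical helper: `emit` maps operand expression strings to one
// JavaScript expression string; `test` is the per-argument type check.
function makeKernelPrimitive(name, typeName, test, emit) {
    return {
        // Open-coded form, used by the compiler at a call site.
        openCode: emit,

        // Runtime form, derived from the same emitter.
        runtime: function (MACHINE, arity) {
            var names = [], args = [];
            for (var i = 0; i < arity; i++) {
                var arg = MACHINE.env[MACHINE.env.length - 1 - i];
                if (!test(arg)) {
                    throw new Error(name + ": expected " + typeName +
                                    " at position " + i);
                }
                names.push("a" + i);
                args.push(arg);
            }
            // Build function(a0, a1, ...) { return <emitted expression>; }
            // A real implementation would cache this per arity.
            var body = new Function(names.join(","),
                                    "return " + emit(names) + ";");
            return body.apply(null, args);
        }
    };
}

var plus = makeKernelPrimitive(
    "+", "number",
    function (x) { return typeof x === "number"; },
    function (operands) {
        return operands.length === 0 ? "0"
                                     : "(" + operands.join(" + ") + ")";
    });
```

Here plus.openCode(["x", "y"]) yields the string "(x + y)" for the compiler
to splice in, while plus.runtime(MACHINE, arity) checks and adds the top
arity stack slots. The type checks are only applied on the runtime path in
this sketch; the open-coded path would still need to emit them.
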
----------------------------------------------------------------------

Runtime values and types are in the plt.runtime namespace. I need
to move types from WeScheme into here.

----------------------------------------------------------------------

Frames and environments.

A CallFrame consists of:

    A return address back to the caller.
    A procedure (the callee).
    A stack.
    A set of continuation marks.

A PromptFrame consists of:

    A return address back to the caller.
    A tag.
    A set of continuation marks.

On exit from a CallFrame,

    MACHINE.env = frame.env

On a regular, generic function call:

    The operator and operands are computed and placed in MACHINE.env's
    scratch space.

    A new call frame is constructed. The frame remembers the environment.

    The machine jumps into the procedure entry.

On a tail call,

    The operator and operands are computed and placed in MACHINE.env's
    scratch space.

    The existing call frame is reused.
    The frame's environment consumes those elements from MACHINE.env
    MACHINE.env = the new stack segment

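One possible reading of the frame discipline above, sketched under
assumptions: the frame's "stack" is taken to be the caller's stack saved
for restoration on exit, and the callee runs on a fresh segment holding its
arguments. The field names and the functions callProcedure,
tailCallProcedure, and exitFrame are hypothetical.

```javascript
// Hypothetical shape of a call frame.
function CallFrame(returnAddress, proc, env) {
    this.returnAddress = returnAddress; // where to resume in the caller
    this.proc = proc;                   // the callee
    this.env = env;                     // the caller's stack, restored on exit
    this.marks = [];                    // continuation marks
}

// Generic call: the top `argcount` slots of MACHINE.env hold the operands.
function callProcedure(MACHINE, proc, argcount, returnAddress) {
    var args = MACHINE.env.splice(MACHINE.env.length - argcount, argcount);
    MACHINE.control.push(new CallFrame(returnAddress, proc, MACHINE.env));
    MACHINE.env = args;                 // the callee's new stack segment
}

// Tail call: reuse the existing frame, so the return address and the saved
// caller stack are unchanged; only the callee and its segment are replaced.
function tailCallProcedure(MACHINE, proc, argcount) {
    var args = MACHINE.env.splice(MACHINE.env.length - argcount, argcount);
    var frame = MACHINE.control[MACHINE.control.length - 1];
    frame.proc = proc;
    MACHINE.env = args;
}

// Exit: restore the caller's stack and report where to jump.
function exitFrame(MACHINE) {
    var frame = MACHINE.control.pop();
    MACHINE.env = frame.env;
    return frame.returnAddress;
}
```

Under this reading the control stack does not grow across a chain of tail
calls, which is the point of reusing the frame.
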
Optimizations with IL

The sequence PushEnvironment ... AssignImmediateStatement (EnvLexicalAddress ...)
where we're assigning directly to a spot we just allocated, can be reduced to
a single instruction.

We can do some constant folding in operands. e.g.

    MACHINE.env[MACHINE.env.length - 1 - 3] = MACHINE.env[MACHINE.env.length - 1 - 7];

    =>

    MACHINE.env[MACHINE.env.length - 4] = MACHINE.env[MACHINE.env.length - 8];

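The folding above could happen in the code emitter itself, since the depth
is known at compile time. A small sketch, with emitEnvRef and emitEnvAssign
as hypothetical helper names:

```javascript
// Hypothetical emitter: `depth` is a statically known stack offset.
function emitEnvRef(depth) {
    // Fold "length - 1 - depth" into "length - (1 + depth)" at compile time.
    return "MACHINE.env[MACHINE.env.length - " + (1 + depth) + "]";
}

// Emit an assignment from one stack slot to another.
function emitEnvAssign(targetDepth, sourceDepth) {
    return emitEnvRef(targetDepth) + " = " + emitEnvRef(sourceDepth) + ";";
}
```

With these, emitEnvAssign(3, 7) produces the folded form shown above.
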
On tail calls, when we're reusing all of the arguments on the stack,
there's no need to splice, since we won't be popping anything off:

    MACHINE.env.splice(MACHINE.env.length - (MACHINE.argcount + ((10) - MACHINE.argcount)), ((10) - MACHINE.argcount));

is a no-op.

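A sketch of skipping the splice when there is nothing to remove. The helper
name popFrameSlots and the frameSize parameter (standing in for the literal
10 above) are assumptions for illustration:

```javascript
// Remove the (frameSize - argcount) slots sitting beneath the top `argcount`
// arguments. When every slot is being reused, there is nothing to remove.
function popFrameSlots(env, argcount, frameSize) {
    var toRemove = frameSize - argcount;
    if (toRemove === 0) {
        return env;  // the splice would delete zero elements: skip it
    }
    env.splice(env.length - (argcount + toRemove), toRemove);
    return env;
}
```

If the compiler knows both numbers statically, the same check can be made
at compile time and the splice statement simply not emitted.
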
In the case where a closure has a prefix, but all the uses of the
prefix are to open-coded primitives, then we don't need to close over
it after all. e.g.

    (test '(begin (letrec ([f (lambda (x) (* x x))]
                           [g (lambda (x) (* x x x))])
                    (- (g (f (+ (g 3) (f 3)))) 1)))
          2176782335
          #:debug? #t)

since (* -) are both open-coded, there's no need to capture the
prefix, and we can reduce some allocation.

I can eliminate the first instruction in the pair:

    #(struct:AssignPrimOpStatement val #(struct:GetCompiledProcedureEntry))
    #(struct:GotoStatement #(struct:Label lamEntry259))

since the val isn't even being used here... This is the case when we
statically know the lambda target.

    - this is done now.

I can coalesce

    (PushEnvironment 1 #f)
    (AssignPrimOpStatement (EnvLexicalReference 0 #f) (MakeCompiledProcedure 'lamEntry265 1 '(2 1) 'diff))

into a single statement.

If lambdas don't escape, then we can make their closures empty by
explicitly passing in the free variables as arguments.

There's no good reason why the IL has both AssignImmediateStatement
and AssignPrimOpStatement. The distinction is artificial because I'm
allowing the RHS of assignments to use arbitrary expressions, since my
runtime (JavaScript) supports it. I should consolidate these
structures; it may allow me to remove a few more instructions (like
setting ControlLabel to 'val).

flush-output must immediately yield control to the browser, because
the browser needs control back to display changes to the DOM.
Basically, we're simulating an IO interrupt here...

April 17, 2011

The dynamic recomputation for gas is only controlling one parameter:
how many times to run the trampoline before bouncing off to the
browser. But we really have two parameters that need dynamic
computation:

    * FN: the number of function calls before invoking the trampoline.

      FN is necessarily bounded above by the browser. The larger it
      is, the more efficient the trampoline can be.

    * TI: the number of trampoline invocations before yielding to the browser

Both of these should be under some dynamic controller. We want to
optimize the efficiency of the runtime. I don't know what the
function is, but we want to optimize the parameters FN and TI such
that it maximizes FN and minimizes TI, and yet gives us the browser
reactivity we want.

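To make the two knobs concrete, here is a toy trampoline with FN and TI as
fixed inputs rather than dynamically controlled ones. Everything in it
(Bounce, makeMachine, the schedule callback standing in for something like
setTimeout) is a hypothetical sketch, not the runtime's actual code.

```javascript
// Thrown to unwind the JavaScript stack back to the trampoline.
function Bounce(thunk) { this.thunk = thunk; }

function makeMachine(FN, TI) {
    var machine = {
        callsLeft: FN,
        bounces: 0,
        yields: 0,

        // Compiled code would route each function call through here.
        call: function (thunk) {
            if (machine.callsLeft-- > 0) { return thunk(); }
            throw new Bounce(thunk);
        },

        // Run `thunk` to completion and pass the result to `onDone`.
        // After TI trampoline iterations, yield through `schedule`.
        run: function (thunk, onDone, schedule) {
            var iterations = TI;
            while (iterations-- > 0) {
                machine.callsLeft = FN;
                try {
                    return onDone(thunk());
                } catch (e) {
                    if (!(e instanceof Bounce)) { throw e; }
                    machine.bounces++;
                    thunk = e.thunk;  // resume from where we bounced
                }
            }
            machine.yields++;
            schedule(function () { machine.run(thunk, onDone, schedule); });
        }
    };
    return machine;
}

// Example program: sum n + (n-1) + ... + 1 using machine-mediated calls.
function countdown(m, n, acc) {
    if (n === 0) { return acc; }
    return m.call(function () { return countdown(m, n - 1, acc + n); });
}
```

A larger FN means fewer bounces (each one pays for a throw and a catch),
and a larger TI means fewer yields to the browser. A dynamic controller
would adjust both while watching how long each stretch between yields takes.
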
April 24, 2011

The variables for linkage and target are doing double duty, which is
showing up in the definition for compilation, since there are cases
that shouldn't exist in there.

They really should be part of the same datatype which describes,
essentially, what the code's continuation should be doing next.
Target describes where the value needs to be installed at the end
of this, and linkage describes how to jump into the continuation.

    Return --- write value to val, pop off and jump according to
    dynamic value on control context. Return context may be in tail
    position or not.

    Next --- write value to a particular target and continue on.

    Label --- write value to a particular target and jump
    unconditionally to labeled location.

The continuation may or may not be expecting multiple values.

    Ignore: doesn't care how many values come back. Throw away values
    if multiple values are passed in.

    Any: receives multiple values, and ensures those values are on the
    stack. If a single value is received, pushes it on the stack and
    sets up argcount to 1.

    N: must receive exactly N values. If there's a mismatch, raises a
    runtime error.

Return will allow Any number of values to come back. It doesn't need
a separate multiple-value context.

Next expects either exactly 1 value to come back, or ignores. So it
needs a multiple-value context.

Label, too, expects exactly 1 value to come back, or ignores. So it
needs a multiple-value context.

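A sketch of what the combined datatype might look like, written in
JavaScript only for illustration (the compiler itself is in Racket). The
constructor names, field names, and the string tags "ignore" and "any" are
assumptions:

```javascript
// Next and Label either expect exactly one value or ignore the count.
function checkSingleOrIgnore(context) {
    if (context !== 1 && context !== "ignore") {
        throw new Error("context must be 1 or \"ignore\", given: " + context);
    }
}

// Return: writes to val and accepts any number of values, so it needs
// no separate multiple-value context.
function makeReturn() {
    return { kind: "return", target: "val", context: "any" };
}

// Next: write the value to `target` and fall through.
function makeNext(target, context) {
    checkSingleOrIgnore(context);
    return { kind: "next", target: target, context: context };
}

// Label: write the value to `target` and jump to `label`.
function makeLabel(target, label, context) {
    checkSingleOrIgnore(context);
    return { kind: "label", target: target, label: label, context: context };
}
```

Bundling target, linkage, and context into one value means the
combinations that shouldn't exist are rejected when the value is built,
rather than showing up as dead cases in the compiler.
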
When we use apply-values, it'll compile the producer expression in an
Any context.

I'm going to simplify values a bit.

----------------------------------------------------------------------

April 28

Multiple values are handled in the following way now.

In a context that expects multiple values to be returned,

    if n = 0, don't leave anything on the stack before jumping out

    if n = 1, put the single value in the 'val register

    if n > 1, put the first value in the 'val register, and leave the
    rest (the n-1 values) on the value stack.

The context is then responsible for dealing with those multiple return
values.

The contexts are now of the following types:

    'tail : keeps the values on the stack. Used specifically for tail return.

    'drop-multiple : drops any extra values on the stack

    'keep-multiple : keeps any number of values on the stack.

    Natural : expects exactly n values. Errors out if this can't be the case.

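The convention and the four context types can be sketched as follows. This
is an illustration under assumptions: that argcount carries n across the
return, that the extra values are pushed so the second value ends up on
top, and that a context is either one of the strings below or a number.
The function names are hypothetical.

```javascript
// Callee side: first value in val, the remaining n-1 on the value stack.
function returnValues(MACHINE, values) {
    MACHINE.argcount = values.length;
    if (values.length >= 1) {
        MACHINE.val = values[0];
    }
    // Push in reverse so that the second value ends up on top (assumption).
    for (var i = values.length - 1; i >= 1; i--) {
        MACHINE.env.push(values[i]);
    }
}

// Receiver side: deal with the returned values according to the context.
function receiveValues(MACHINE, context) {
    var n = MACHINE.argcount;
    var extras = Math.max(n - 1, 0);
    if (context === "tail" || context === "keep-multiple") {
        return;                           // leave the values where they are
    }
    if (context === "drop-multiple") {
        MACHINE.env.length -= extras;     // discard everything beyond val
        return;
    }
    // Natural number: exactly that many values, or an error.
    if (n !== context) {
        throw new Error("expected " + context + " values, received " + n);
    }
}
```

For example, after returnValues(MACHINE, [1, 2, 3]), a 'drop-multiple
receiver pops two slots and continues with 1 in val, while a receiver
whose context is 1 raises an error.
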
There appears to be a bug in compile-splice regarding multiple value
contexts. I haven't yet fixed the bug. I need a test case. I need
to somehow create a splicing expression in the context of something
that expects multiple values back. I'm not exactly sure how to create
such a context.

Ok, I think I've been able to do this successfully. I lifted out the
code for emit-values-context-check-on-procedure-return so it's used
for both the returns from procedure call, as well as the calls from
the prompt splicing.

----------------------------------------------------------------------

May 13

begin0 is still broken; I'm a bit unsatisfied with the way that it's
coded, and I know it's not correct yet.

The other thing that I'm sure needs to be looked at again is the
implementation of prompts.

Modules are almost in place. They need a mechanism for loading modules
on the fly over the network, as well as some kind of predictable
namespacing mechanism. I think the compiler will need to include
something like a

    (current-module-name-canonizer)

which takes module names (symbol, path-string) and systematically
translates them to predictable identifiers. Anything referring to a
collection should be translated to

    collects/...

Anything outside that should be given a name relative to some root.
One should be able to say:

    root the translation at "/home/dyoo/work/js-sicp-5.5/examples"

where all translated paths are either from collections, or reachable
from the root. That way, we get predictable paths.

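
The translation could work roughly like this. This is a sketch in
JavaScript for illustration only (the real canonizer would be a Racket
parameter); the function name, the collections directory passed in, and
the "root/" prefix for root-relative names are all assumptions.

```javascript
// Build a canonizer from a root directory and a collections directory.
function makeModuleNameCanonizer(root, collectsDir) {
    function stripTrailingSlashes(p) { return p.replace(/\/+$/, ""); }
    root = stripTrailingSlashes(root);
    collectsDir = stripTrailingSlashes(collectsDir);

    return function (path) {
        // Anything in a collection gets a collects/... name.
        if (path.indexOf(collectsDir + "/") === 0) {
            return "collects/" + path.slice(collectsDir.length + 1);
        }
        // Anything else must be reachable from the root.
        if (path.indexOf(root + "/") === 0) {
            return "root/" + path.slice(root.length + 1);
        }
        throw new Error("module is neither in a collection nor under the root: " + path);
    };
}
```

Rejecting paths outside both locations is what keeps the names
predictable: the same module gets the same identifier regardless of where
the project is checked out, as long as the root is set accordingly.
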