| |
In The second ACM SIGPLAN conference on History of programming languages, Vol. 28, No. 3. (March 1993), pp. 231-270, doi:10.1145/154766.155373
Abstract
Lisp is the world's greatest programming language—or so its proponents think. The structure of Lisp makes it easy to extend the language or even to implement entirely new dialects without starting from scratch. Overall, the evolution of Lisp has been guided more by institutional rivalry, one-upsmanship, and the glee born of technical cleverness that is characteristic of the “hacker culture” than by sober assessments of technical requirements. Nevertheless this process has eventually produced both an industrial-strength programming language, messy but powerful, ...
|
| |
In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858-867
|
| |
Abstract
The success of the von Neumann model of sequential computation is attributable to the fact that it is an efficient bridge between software and hardware: high-level languages can be efficiently compiled on to this model; yet it can be effeciently implemented in hardware. The author argues that an analogous bridge between software and hardware in required for parallel computation if that is to become as widely used. This article introduces the bulk-synchronous parallel (BSP) model as a candidate for this role, ...
|
| |
Abstract
Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its ...
|
| |
In OOPSLA '04: Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (2004), pp. 50-68, doi:10.1145/1028976.1028982
Abstract
Tracing and reference counting are uniformly viewed as being fundamentally different approaches to garbage collection that possess very distinct performance properties. We have implemented high-performance collectors of both types, and in the process observed that the more we optimized them, the more similarly they behaved - that they seem to share some deep structure. ...
|
| |
|
| |
Abstract
Many interactive computing environments provide automatic storage reclamation and virtual memory to ease the burden of managing storage. Unfortunately, many storage reclamation algorithms impede interaction with distracting pauses. Generation Scavenging is a reclamation algorithm that has no noticeable pauses, eliminates page faults for transient objects, compacts objects without resorting to indirection, and reclaims circular structures, in one third the time of traditional approaches. We have incorporated Generation Scavenging in Berkeley Smalltalk(BS), our Smalltalk-80 implementation, and ...
|
| |
Information Processing Letters, Vol. 25, No. 4. (1987), pp. 275-279
Abstract
A very old and simple algorithm for garbage collection gives very good results when the physical memory is much larger than the number of reachable cells. In fact, the overhead associated with allocating and collecting cells from the heap can be reduced to less than one instruction per cell by increasing the size of physical memory. Special hardware, intricate garbage-collection algorithms, and fancy compiler analysis become unnecessary. ...
|
| |
(15 January 2003)
posted to common-lisp compiler design
by brothers
on 2010-12-20 06:17:41
Abstract
This report documents internal details of the CMU Common Lisp compiler and run-time system. CMU Common Lisp is a public domain implementation of Common Lisp that runs on various Unix workstations. This document is a work in progress: neither the contents nor the presentation are completed. Nevertheless, it provides some useful background information, in particular regarding the CMUCL compile ...
|
| |
(04 December 1991)
Abstract
This is an overview of classical artificial intelligence (AI) programming via actual implementation of landmark systems (case studies). For the student interested in AI, _Paradigms of Artificial Intelligence Programming_ is an invaluable history lesson. Even the programmer who is relatively uninterested in AI will find value in the book's basic introduction to Lisp and case studies written in Lisp. But perhaps the book's best feature is its information on efficiency considerations in Lisp. _Paradigms of Artificial Intelligence Programming_ is worth purchasing for these discussions alone, which provide a wealth ...
|
| |
(02 November 1995)
Abstract
This book provides an excellent introduction to Common Lisp. In addition to chapters covering the basic language concepts, there are sections discussing the Common Lisp object system (CLOS) and speed considerations in Lisp. Three fair-sized examples of nontrivial Lisp projects are also included. The book's clear and engaging format explains complicated constructs simply. This format makes ANSI Common Lisp accessible to a general audience--even those who have never programmed before. The book also provides an excellent perspective on the value of ...
|
| |
|
| |
(24 Oct 2010)
Abstract
We present two novel approaches to parsing context-free languages. The first approach is based on an extension of Brzozowski's derivative from regular expressions to context-free grammars. The second approach is based on a generalization of the derivative to parser combinators. The payoff of these techniques is a small (less than 250 lines of code), easy-to-implement parsing library capable of parsing arbitrary context-free grammars into lazy parse forests. Implementations for both Scala and Haskell are provided. Preliminary experiments with S-Expressions parsed millions of tokens per second, which suggests this technique ...
|
| |
(30 July 2003)
Abstract
During the past decade there has been an explosion in computation and information technology. With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in ...
|
| |
Abstract
Computing economics are changing. Today there is rough price parity between: (1) one database access; (2) 10 bytes of network traffic; (3) 100,000 instructions; (4) 10 bytes of disk storage; and (5) a megabyte of disk bandwidth. This has implications for how one structures Internet-scale distributed computing: one puts computing as close to the data as possible in order to avoid expensive network traffic. ...
|
| |
(1996)
Abstract
this paper. Note that the cons operation supplied by this library is strict, not lazy. In fact, the only lazy operations in this library are ++ (infix append) and reverse. 2 FIFO Queues Stacks and queues are usually the first two data structures studied by beginning computer science students. The typical imperative implementation of (unbounded) stacks as linked lists translates very naturally to a functional setting. However, the typical imperative implementation of (unbounded) queues as linked lists does not because it ...
|
| |
Abstract
Dynamic memory management is one of the most expensive but ubiquitous operations in many C/C++ applications. Additional features such as security checks, while desirable, further worsen memory management overheads. With advent of multicore architecture, it is important to investigate how dynamic memory management overheads for sequential applications can be reduced. In this paper, we propose a new approach for accelerating dynamic memory management on multicore architecture, by offloading dynamic management functions to a separate thread that we refer to as memory management thread (MMT). We show that an efficient MMT design can ...
|
| |
In POPL '09: Proceedings of the 36th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages (2009), pp. 177-185, doi:10.1145/1480881.1480905
Abstract
Parallel programs on lists have been intensively studied. It is well known that associativity provides a good characterization for divide-and-conquer parallel programs. In particular, the third homomorphism theorem is not only useful for systematic development of parallel programs on lists, but it is also suitable for automatic parallelization. The theorem states that if two sequential programs iterate the same list leftward and rightward, respectively, and compute the same value, then there exists a divide-and-conquer parallel program that computes the same value ...
|
| |
posted to divide-and-conquer parallelism
by brothers
on 2010-03-31 06:12:04
Abstract
Divide-and-conquer algorithms are suitable for modern parallel machines, tending to have large amounts of inherent parallelism and working well with caches and deep memory hierarchies. Among others, list homomorphisms are a class of recursive functions on lists, which match very well with the divide-and-conquer paradigm. However, direct programming with list homomorphisms is a challenge for many programmers. In this paper, we propose and implement a novel system that can automatically derive costoptimal list homomorphisms from a pair of sequential programs, based ...
|
| |
(2008)
posted to activity-stream social-network
by brothers
on 2010-03-16 12:40:02
Abstract
Social navigation usage on the Social Web were studied by conducting content analyzes to see how prevalent such navigation is now compared to the Web's earlier years. The common characteristic of the types of social navigation we found in these sites were the reliance on peers for the information used in the navigation process. We therefore built on existing definitions of social navigation an provided our own definition which emphasized the essentialness of peers in the web site one is navigating ...
|
| |
Abstract
A method for locating specific character strings embedded in character text is described and an implementation of this method in the form of a compiler is discussed. The compiler accepts a regular expression as source language and produces an IBM 7094 program as object language. The object program then accepts the text to be searched as input and produces a signal every time an embedded string in the text matches the given regular expression. Examples, problems, and solutions are also presented. ...
|
| |
Social Science Research Network Working Paper Series (31 May 2006)
Abstract
With an impressive theoretical foundation, normalization was supposed to bring rigor and relevance into such a slippery domain as database design is. Almost every database textbook treats normalization in a certain extent, usually suggesting that the topic is so clear and consolidated that it does not deserve deeper discussions. But the reality is completely different. After more than three decades, normalization not only has lost much of its interest in the research papers, but also is still looking for practitioners to ...
|
| |
In Seattle Tech Startups (9 September 2009)
posted to experiment microsoft web
by brothers
on 2009-09-16 02:14:42
Abstract
Knowledge Discovery and Data Mining techniques are now commonly used to find novel, potentially useful, patterns in data (Fayyad, et al., 1996; Chapman, et al., 2000). Most KDD applications involve post-hoc analysis of data and are therefore mostly limited to the identification of correlations. Recent seminal work on Quasi-Experimental Designs (Jensen, et al., 2008) attempts to identify causal relationships. Controlled experiments are a standard technique used in multiple fields. Through randomization and proper design, experiments allow establishing causality scientifically, which is ...
|
| |
posted to text-processing unix
by brothers
on 2009-09-10 12:50:47
|
| |
Abstract
A much beloved and widely used example showing the elegance and simplicity of lazy functional programming represents itself as “The Sieve of Eratosthenes.” This paper shows that this example is not the sieve and presents an implementation that actually is. ...
|
| |
(14 April 2008)
Abstract
Gigantically comprehensive and carefully researched, _Security Engineering_ makes it clear just how difficult it is to protect information systems from corruption, eavesdropping, unauthorised use and general malice. Better, Ross Anderson offers a lot of thoughts on how information can be made more secure (though probably not absolutely secure, at least not forever) with the help of both technologies and management strategies. His work makes fascinating reading, and will no doubt inspire considerable doubt--fear is probably a better choice of words--in anyone with information to gather, protect, or make decisions ...
|
| |
IEEE Transactions on Signal Processing In Signal Processing, IEEE Transactions on, Vol. 41, No. 12. (Dec 1993), pp. 3445-3462, doi:10.1109/78.258085
Abstract
The embedded zerotree wavelet algorithm (EZW) is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, yielding a fully embedded code. The embedded code represents a sequence of binary decisions that distinguish an image from the “null” image. Using an embedded coding algorithm, an encoder can terminate the encoding at any point thereby allowing a target rate or target distortion metric to be met exactly. Also, given a bit stream, the decoder can cease ...
|
| |
In SODA '09: Proceedings of the Nineteenth Annual ACM -SIAM Symposium on Discrete Algorithms (2009), pp. 477-485
Abstract
Chazelle (JACM 47(6), 2000) devised an approximate meldable priority queue data structure, called Soft Heaps , and used it to obtain the fastest known deterministic comparison-based algorithm for computing minimum spanning trees, as well as some new algorithms for selection and approximate sorting problems. If n elements are inserted into a collection of soft heaps, then up to εn of the elements still contained in these heaps, for a given error parameter ε , may ...
|
| |
Abstract
A simple variant of a priority queue, called a soft heap, is introduced. The data structure supports the usual operations: insert, delete, meld, and findmin. Its novelty is to beat the logarithmic bound on the complexity of a heap in a comparison-based model. To break this information-theoretic barrier, the entropy of the data structure is reduced by artificially raising the values of certain keys. Given any mixed sequence of n operations, a soft heap with error rate ε (for any 0 ...
|
| |
In ACM Conference on Computer and Communications Security (1993), pp. 62-73
Abstract
We argue that the random oracle model ---where all parties have access to a public random oracle--- provides a bridge between cryptographic theory and cryptographic practice. In the paradigm we suggest, a practical protocol P is produced by first devising and proving correct a protocol P^R for the random oracle model, and then replacing oracle accesses by the computation of an "appropriately chosen" function h. This paradigm yields protocols much more efficient than standard ones while... ...
|
| |
Machine Learning, Vol. 24, No. 2. (1996), pp. 123-140
Abstract
Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear... ...
|
| |
Abstract
Learning from examples is the process of taking input-output examples of an unknown function f and infering an implementation of f. Learning from hints allows for general information about f to be used instead of just input-output examples. We introduce a method for incorporating any invariance hint about f in any descent method for learning from examples. We also show that learning in a neural network remains NP-complete with a certain, biologically plausible, hint about the network. We discuss the information ...
|
| |
In Advances in Ultra-Dependable Distributed Systems, N. Suri, C. J. Walter, and M. M. Hugue (Eds.), IEEE Computer Society Press (1995)
Abstract
Reliable computer systems must handle malfunctioning components that give conflicting information to different parts of the system. This situation can be expressed abstractly in terms of a group of generals of the Byzantine army camped with their troops around an enemy city. Communicating only by messenger, the generals must agree upon a common battle plan. However, one of more of them may be traitors who will try to confuse the others. The problem is to find an algorithm to ensure that ...
|
| |
|
| |
|
| |
Pattern Analysis and Machine Intelligence, IEEE Transactions on, Vol. 22, No. 12. (06 December 2000), pp. 1349-1380, doi:10.1109/34.895972
Abstract
The paper presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, ...
|
| |
Abstract
Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control. Virtual time provides a flexible abstraction of real time in much the same way that virtual memory provides an abstraction of real memory. It is implemented using the Time Warp mechanism, a synchronization protocol distinguished by its reliance on lookahead-rollback, and by its implementation of rollback via antimessages. ...
|
| |
Abstract
Inverted index structures are a core element of current text retrieval systems. They can be constructed quickly using offline approaches, in which one or more passes are made over a static set of input data, and, at the completion of the process, an index is available for querying. However, there are search environments in which even a small delay in timeliness cannot be tolerated, and the index must always be queryable and up to date. Here we describe and analyze a ...
|
| |
Abstract
The technology for building knowledge-based systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete. A reported shortcoming of the basic algorithm is discussed and two means of ...
|
| |
IEEE Transactions on Computers, Vol. 35 (1986), pp. 677-691
Abstract
In this paper we present a new data structure for representing Boolean functions and an associated set of manipulation algorithms. Functions are represented by directed, acyclic graphs in a manner similar to the representations introduced by Lee [1] and Akers [2], but with further restrictions on the ordering of decision variables in the graph. Although a function requires, in the worst case, a graph of size exponential in the number of arguments, many of the functions encountered in typical applications have ...
|
| |
In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, Vol. 31, No. 4. (October 2001), pp. 149-160, doi:10.1145/383059.383071
Abstract
A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave ...
|
| |
In 2nd Symposium on Operating Systems Design and Implementation (OSDI '96), October 28--31, 1996. Seattle, {WA} (1996), pp. 229-243
Abstract
This paper describes a mechanism by which an operating system kernel can determine with certainty that it is safe to execute a binary supplied by an untrusted source. The kernel first defines a safety policy and makes it public. Then, using this policy, an application can provide binaries in a special form called proof-carrying code, or simply PCC. Each PCC binary contains, in addition to the native code, a formal proof that the code obeys the safety policy. The kernel can easily validate the... ...
|
| |
IEEEash ACM Transactions on Networking, Vol. 1, No. 4. (1993), pp. 397-413
Abstract
This paper presents Random Early Detection (RED) gateways for congestion avoidance in packetswitched networks. The gateway detects incipient congestion by computing the average queue size. The gateway could notify connections of congestion either by dropping packets arriving at the gateway or by setting a bit in packet headers. When the average queue size exceeds a preset threshold, the gateway drops or marks each arriving packet with a certain probability, where the exact probability is a... ...
|
| |
In Proceedings of the 1988 ACM SIGMOD international conference on Management of data (1988), pp. 109-116, doi:10.1145/50202.50214
Abstract
Increasing performance of CPUs and memories will be squandered if not matched by a similar performance increase in I/O. While the capacity of Single Large Expensive Disks (SLED) has grown rapidly, the performance improvement of SLED has been modest. Redundant Arrays of Inexpensive Disks (RAID), based on the magnetic disk technology developed for personal computers, offers an attractive alternative to SLED, promising improvements of an order of magnitude in performance, reliability, power consumption, and scalability. This paper introduces five levels of ...
|
| |
Abstract
The programming of a proof procedure is discussed in connection with trial runs and possible improvements. ...
|
| |
Abstract
There is a deep and useful connection between statistical mechanics (the behavior of systems with many degrees of freedom in thermal equilibrium at a finite temperature) and multivariate or combinatorial optimization (finding the minimum of a given function depending on many parameters). A detailed analogy with annealing in solids provides a framework for optimization of the properties of very large and complex systems. This connection to statistical mechanics exposes new information and provides an unfamiliar perspective on traditional optimization problems and ...
|
| |
|
| |
Abstract
Computer vision is embracing a new research focus in which the aim is to develop visual skills for robots that allow them to interact with a dynamic, realistic environment. To achieve this aim, new kinds of vision algorithms need to be developed which run in real time and subserve the robot's goals. Two fundamental goals are determining the location of a known object. Color can be successfully used for both tasks. ...
|
| |
Abstract
An encryption method is presented with the novel property that publicly revealing an encryption key does not thereby reveal the corresponding decryption key. This has two important consequences: 1. Couriers or other secure means are not needed to transmit keys, since a message can be enciphered using an encryption key publicly revealed by the intended recipient. Only he can decipher the message, since only he knows the corresponding decryption key. 2. A message can be “signed” using a privately held decryption ...
|
| |
Abstract
MultiJava is a conservative extension of the Java programming language that adds symmetric multiple dispatch and open classes. Among other benefits, multiple dispatch provides a solution to the binary method problem. Open classes provide a solution to the extensibility problem of object-oriented programming languages, allowing the modular addition of both new types and new operations to an existing type hierarchy. This article illustrates and motivates the design of MultiJava and describes its modular static typechecking and modular compilation strategies. Although MultiJava ...
|