RodolfoCarvalho.net

Tuesday, May 29, 2012

Compiling SQLite as a shared library on Ubuntu

I wanted to have the most recent SQLite available for my Python applications.
Ubuntu repositories couldn't help, so, for future reference, here is what I did:

cd /tmp

wget http://www.sqlite.org/sqlite-autoconf-3071201.tar.gz

tar xvzf sqlite-autoconf-3071201.tar.gz

cd sqlite-autoconf-3071201/

# set your own options

CFLAGS="-Os -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_RTREE" ./configure --prefix=/usr

make

# quick-n-dirty way to replace the original lib

sudo mv /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6{,.orig}
sudo mv /usr/bin/sqlite3{,.orig}

chmod -x .libs/libsqlite3.so.0.8.6
sudo cp .libs/libsqlite3.so.0.8.6 /usr/lib/x86_64-linux-gnu/
sudo cp .libs/sqlite3 /usr/bin/

cd

# check that Python sees the new version

python -c 'import sqlite3; print(sqlite3.sqlite_version)'

# check that SQLite shell works
sqlite3
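
For a slightly more thorough check from Python, here is a minimal sketch (Python 3; the helper name sqlite_build_info is my own, not part of any library) that reports the library version together with the compile-time options, so you can see whether the flags passed in CFLAGS actually made it into the build. It assumes the build supports PRAGMA compile_options, which reports each option without its SQLITE_ prefix.

```python
import sqlite3


def sqlite_build_info():
    """Return (library_version, compile_options) for the SQLite that Python loaded."""
    conn = sqlite3.connect(":memory:")
    try:
        # PRAGMA compile_options lists the -DSQLITE_* flags without the SQLITE_ prefix.
        options = [row[0] for row in conn.execute("PRAGMA compile_options")]
    finally:
        conn.close()
    return sqlite3.sqlite_version, options


version, options = sqlite_build_info()
print("SQLite library version:", version)
for flag in ("ENABLE_FTS3", "ENABLE_RTREE"):
    print(flag, "yes" if flag in options else "no")
```

If the version printed here is still the old one, Python is probably loading a different copy of libsqlite3 than the one you replaced.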

Sunday, May 27, 2012

How to select random rows from a SQLite table fast

Today I wanted to select some random rows from a large (~3GB, ~900K rows) SQLite database, where the approach found elsewhere on the Web is simply too slow:

SELECT * FROM table ORDER BY random() LIMIT n;

Instead of doing that, I came up with a faster alternative:

SELECT * FROM table WHERE random() % k = 0 LIMIT n;

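In the examples above, adjust *, table, k and n to suit your needs. n is the maximum number of rows returned, and k is a positive integer constant that sets the probability of selecting any given row to about 1/k.
For instance, k = 2 means a probability of about 0.5, k = 3 about 0.33, k = 4 about 0.25, and so on. Note that LIMIT stops the scan as soon as n rows have matched, so if k is too small the sample comes only from the beginning of the table; a k close to the total number of rows divided by n spreads the sample over the whole table.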
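
To make the roles of n and k concrete, here is a small sketch using Python's sqlite3 module. The items table, its contents and the sample_rows helper are made up for the example, and choosing k as the row count divided by 2n is just one reasonable heuristic, not something I measured.

```python
import sqlite3


def sample_rows(conn, table, n, k):
    """Return at most n rows, each picked with a probability of about 1/k."""
    # The table name cannot be a bound parameter, so it is interpolated;
    # only pass trusted names here.
    query = 'SELECT * FROM "{}" WHERE random() % ? = 0 LIMIT ?'.format(table)
    return conn.execute(query, (k, n)).fetchall()


# Hypothetical demo data: an in-memory table with 10,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO items (value) VALUES (?)",
    (("row %d" % i,) for i in range(10000)),
)

n = 10
total = conn.execute("SELECT count(*) FROM items").fetchone()[0]
k = max(1, total // (2 * n))  # about 2n rows match on average; LIMIT trims to n
rows = sample_rows(conn, "items", n, k)
print(len(rows), [row[0] for row in rows])
```

Since the selection is random, the query can occasionally return fewer than n rows; a smaller k makes that less likely at the cost of sampling less of the table.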

My alternative above returns random rows, but in table scan order, which typically means sorted by the primary key. If you want random rows in random order, you can save the retrieved rows in a temporary table and then shuffle them:

CREATE TEMP TABLE temp_rows AS
SELECT * FROM table WHERE random() % k = 0 LIMIT n;
SELECT * FROM temp_rows ORDER BY random();

That went faster than a compound select.
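
The same two-step idea, driven from Python, could look like the sketch below. The items table and the sample_shuffled helper are hypothetical; k and n are forced to integers before being formatted into the SQL text.

```python
import sqlite3


def sample_shuffled(conn, table, n, k):
    """Sample up to n rows into a temp table, then return them in random order."""
    n, k = int(n), int(k)  # force integers before formatting them into SQL
    conn.execute("DROP TABLE IF EXISTS temp.temp_rows")
    conn.execute(
        "CREATE TEMP TABLE temp_rows AS "
        'SELECT * FROM "{}" WHERE random() % {} = 0 LIMIT {}'.format(table, k, n)
    )
    # Only the (small) temp table is sorted, not the whole source table.
    return conn.execute("SELECT * FROM temp_rows ORDER BY random()").fetchall()


# Hypothetical demo data: an in-memory table with 1,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO items (value) VALUES (?)",
    [("row %d" % i,) for i in range(1000)],
)

rows = sample_shuffled(conn, "items", 5, 50)
print(rows)
```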

Thursday, May 10, 2012

Being a better person every day

As inspiration to give your best every single day, here is a message posted by my former COMCIA (long live the Colégio Naval!) Emerson. I highlighted a few parts:

I can't think of anything more gratifying than reaching the end of each day certain that I did the best I could at work, treated the people around me as well as I could, and gave my family as much attention and affection as I could.
If it's a dog-eat-dog world and nothing is any good, at least today I did my part to change that, and so I can go to bed with a clear conscience!
The best and the worst part is that tomorrow the challenge starts all over again, from scratch!!
I don't know if I'll manage to pull it off once more, but I'll try, and keep on trying, until the effort becomes a habit.
Good night, everyone!!!
(Emerson Serafim)

Sunday, February 5, 2012

Unit conversion with Racket

A quick post to share a simple unit converter written in Racket.
I decided to write it after seeing the idea of keeping units together with values in the official Erlang tutorial.

In this example, we convert centimeters to inches and vice versa. It is easy to extend the idea to other kinds of units, such as temperature, pressure, currency, etc.

The core is just this:

(define (convert-length value)
  (match value
    [`(,(? number? v) centimeter) `(,(/ v 2.54) inch)]
    [`(,(? number? v) inch) `(,(* v 2.54) centimeter)]))

In other words, we define a function called convert-length that takes a value. The interesting part is that this value is not a plain number, but a data structure holding a number and a unit.
This way we can have a single conversion function, and we always know exactly which unit we are working with.

The function uses pattern matching, a standard and quite powerful Racket feature for working with data structures, to perform the appropriate conversion according to the input value.

[`(,(? number? v) centimeter) ; pattern
 `(,(/ v 2.54) inch)] ; returned value

The first pattern matches a list made of a number, bound to the variable v, and the literal symbol centimeter. The second part of the clause computes the conversion (one inch is 2.54 centimeters, so we divide) and builds a structure with the new unit of measurement.

The complete code:

#lang racket
(require rackunit)

;; Idea from the Erlang Tutorial
;; http://www.erlang.org/doc/getting_started/seq_prog.html#id64621
(define (convert-length value)
  (match value
    [`(,(? number? v) centimeter) `(,(/ v 2.54) inch)]
    [`(,(? number? v) inch) `(,(* v 2.54) centimeter)]
    [_ (error "Wrong value format. Value must be a list of two elements: a number and a unit, inch or centimeter.")]))

;; From Matthias Felleisen
;; http://lists.racket-lang.org/users/archive/2011-November/049154.html
(define (tee tag v)
  (displayln `(,tag ,v))
  v)

;;----------------------------------------------------------------------
;; Tests
(check-equal?
 (tee 'test-1 (convert-length '(2.54 centimeter)))
 '(1.0 inch))

(check-equal?
 (tee 'test-2 (convert-length '(1 inch)))
 '(2.54 centimeter))

(check-equal?
 (tee 'test-3 (convert-length (convert-length '(1 inch))))
 '(1.0 inch))

(check-exn exn:fail?
 (λ ()
   (convert-length 3.4)))
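
For readers who don't use Racket, the same tagged-value idea can be sketched in Python with plain tuples. This is only an illustration under my own assumptions: the (number, unit) tuple layout and the use of ValueError are choices made for this sketch, and the factor is the usual 1 inch = 2.54 centimeters.

```python
def convert_length(value):
    """Convert a (number, unit) pair between 'centimeter' and 'inch'."""
    message = "Value must be a (number, unit) pair with unit 'inch' or 'centimeter'."
    try:
        number, unit = value
    except (TypeError, ValueError):
        raise ValueError(message)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(number, bool) or not isinstance(number, (int, float)):
        raise ValueError(message)
    if unit == "centimeter":
        return (number / 2.54, "inch")
    if unit == "inch":
        return (number * 2.54, "centimeter")
    raise ValueError(message)


print(convert_length((1, "inch")))
print(convert_length((2.54, "centimeter")))
```

As in the Racket version, the unit travels with the number, so a bare 3.4 is rejected instead of being silently treated as one unit or the other.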

Friday, January 6, 2012

Some Computer Science Terminology

[This post is in English. If you can't follow it, use a translator. The Wikipedia links can also be read in their Portuguese versions, so you can still get something out of it.]

I wrote this as an email to someone special, but then I thought it could be shared here.

It is simply a few links and comments on topics we were talking about one night... you might be interested as well, so feel free to join the conversation.

Hello dear reader!

Words in bold are terms worth becoming familiar with, IMHO.

Context

http://en.wikipedia.org/wiki/Context_(computing)

(Additional term linked above: State http://en.wikipedia.org/wiki/State_(computer_science))

Environment

Note that in Scheme (at least in MIT Scheme, linked below) an environment is a first-class object, which means it can be assigned to a variable, passed as an argument, and used just like a number or a string.

Note also how Common Lisp differs from Scheme (and from most other programming languages that I am familiar with and can think of now) in the number of environments. CL is said to be a Lisp-2, while Scheme is a Lisp-1. This means that Scheme has a single lexical environment containing all functions and variables, while CL has two distinct ones: one only for functions and another, independent one for variables.
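
A tiny illustration of the Lisp-1 side, using Python as a stand-in since it also keeps functions and variables under the same names (the names square and alias are made up for the example):

```python
def square(x):
    return x * x


alias = square  # the function is an ordinary value bound to a name
square = 10     # rebinding the same name replaces the function with a number

print(callable(square))  # False: the name no longer refers to a function
print(alias(4))          # 16: the function object itself is still reachable
```

In a Lisp-2 such as Common Lisp, a function and a variable named square could coexist, because each lives in its own namespace.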

You can also look into the terms scope and namespace, which play a similar role in several languages.

http://www.gnu.org/software/mit-scheme/documentation/mit-scheme-ref/Environment-Operations.html

http://en.wikipedia.org/wiki/Scope_(programming)

http://en.wikipedia.org/wiki/Namespace_(programming)

Turing Machine

Keep in mind how little you need to build a computing machine (and think about what it means to "compute," by the way). Keep in mind, too, that the Turing Machine is not practical as a real computer, but is of fundamental theoretical importance.
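
To get a feeling for how little is needed, here is a toy sketch of a Turing machine simulator in Python. The rule format, the "start" and "halt" state names and the bit-inverting machine are all my own choices for illustration; a real definition also lets the tape extend forever in both directions, which the dictionary below only imitates.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10000):
    """Run a one-tape machine until it reaches 'halt'; return the final tape."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        symbol = cells.get(head, blank)
        # Each rule maps (state, symbol read) to (symbol to write, move, next state).
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# A three-rule machine that inverts every bit and halts at the first blank.
INVERT_BITS = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine(INVERT_BITS, "1011"))  # prints 0100
```

A table of rules, a tape and a head position are the whole machine; everything else is bookkeeping.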

http://en.wikipedia.org/wiki/Turing_machine

http://en.wikipedia.org/wiki/Alan_Turing

(John) von Neumann

http://en.wikipedia.org/wiki/John_von_Neumann

http://en.wikipedia.org/wiki/Von_Neumann_architecture
