# Practical Techniques for Searches on Encrypted Data——Literature Notes

## Section one: abstract & introduction

### Part one: Motivation:

in order to solve How to support searching functionality without any loss if data confidentiality.

#### Why searches on encrypted data?

• Searching on encrypted e-mails on servers
• Searching on encrypted files on file servers
• Searching on encrypted database

#### Why is this hard?

• perform computations on encrypted data is often hard
• security and functionality

### part two: methods:

five different schemes based on probabilistic searching(all positions with erroneous positions($l/2^m$), $l$ means the length of document）

### part three: the two main parts：

• cryptographic schemes for the problem of searching on encrypted data
• proofs of security for the resulting crypto systems.

### part four: Provable security:

• provide provably secure, untrusted server cannot get any information from the data storage
• provide query isolation for searchers, means limited information(searching results) for untrusted server
• provide controlled searching, cannot search other information without authorization
• provide Hidden queries, does not reveal the search words.

### part five: Efficiency

• Low space and communication overhead

#### Other information about the algorithm:

• O(n) for a document of length n, fast and simple
• efficient and practical

## section two: Background and Definition

### part one: two types of approaches:

• build up an index
• perform a sequential scan without index

comparation:

• index may be faster when the documents are large, but storing and updating the index can be substantial overhead.
• the index-based schemes require less sophisticated construction and which are more suitable for mostly-read-only data.

### part two: Definition

notiondescription
$R$-breaksattacks algorithm with resource specified by $R$
$R$-secureno algorithm can $R$-breaks it
Advantage of $A$distinguishing probability of $A$
$A$an arbitrary algorithm: $\{0,1\}^n \rightarrow \{0,1\}$
$G$:(t,e)-secureA pseudorandom generator: $\mathcal{K}_G \rightarrow S$
$F$:(t,q,e)-secureA pseudorandom function:$\mathcal{K}_F \times \mathcal{X} \rightarrow \mathcal{Y}$
$E$:(t,q,e)-secureA pseudorandom permutation:$\mathcal{K}_E \times \mathcal{Z} \rightarrow \mathcal{Z}$
$A$$Adv A =Pr[A(X) = 1]-Pr[A(Y) = 1] G$$Adv A =Pr[(A(G(U_{mathcal{K}_G}))) = 1]-Pr[A(U_S) = 1]$
$F$$Adv A =Pr[A^{F_k} =1]-Pr[A^R = 1] E$$Adv A =Pr[A^{E_k,E_k^{-1}} = 1] - Pr[A^{pi,pi^{-1}} = 1]$

## Section three： Solution with sequential scan

### Part one: Scheme I - The Basic Scheme

step one: generates a sequence of pseudorandom values $S_1,...,S_{\mathcal{l}}$ via the pseudorandom generator $G$. and each $S$ is $n - m$ bits long.

step two: then, take $S_i$ to set $T_i := <S_i,F_{k_i}(S_i)>$, the length of $T_i$ is the same as the length of a word($n$) , and let $L_i = S_i$ and $R_i = F_{k_i}(S_i)$

step three: output the ciphertext: $C_i := W_i \oplus T_i$

the Key value: $k_i$ is stored by Alice, not the server. so only Alice can generate the pseudorandom bits to encrypt and decrypt.

Encryption and Decryption as follows.

#### Problems with Basic Scheme

• Queries are not hidden, server learn word
• Query isolation is not satisfied, server learns K and search for arbitrary word.

In other words, Alice must reveal all the $k_i$(thus potentially revealing the entire document), or Alice must know in advance which locations $W$ may appear at.

### Part two： Scheme II - Controlled Searching

Instead of $R_i \leftarrow F_k(L_i)$, just generate $S_i$ where $K_i = F_k^{'}(W_i)$.

$T_i := <R_i,F_{k_i}(R_i)>, where K_i = F_k{'}(W_i)$ .

So there an additional pseudorandom function: $f : \mathcal{K}_F \times \{0,1\}^* \rightarrow \mathcal{K}_F$, using this function to choose keys as $k_i := f_{k^{'}}(W_i)$.

And there are many different ways to generate $k_i$:

• $k_i := f_{k^{'}}(<C,W>)$, where $C$ is chapter.
• set $k_i := f_l(<0,C>)$ and $l := f_{k^{'}}(<1,W_i>)$,

• for each chapter, to reveal $k_i := f_{f_{f^{'}}(<1,W>)}(<0,C>)$
• for all chapter, to reveal $l := f_{k^{'}}(<1,W_i>)$

### Part three: Scheme III - Hidden Searches

pre-encrypt each word W of the clear text separately using a deterministic encryption algorithm $E_{k^{"}}$

About the $E_{k^{"}}$:

• not allowed to use any randomness.
• the computation $E_{k^{"}}(x)$ may depend only on $x$ and not the position $i$

Hidden Queries schemes and improved Security(Change K):  ### Part four: Scheme IV - Final Scheme

in order to fix a problem that Alice maybe bot decrypt without knowing $E_{k^{"}}(W_i)$ or last $m$ bits of it.

there is a simple fix:

• split the pre-encrypted word $X_i := E_{k^{"}}(W_i)$ into two parts$<L_i,R_i>$
• Instead of generating $k_i := X_i$, generate $k_i := E_{k^{"}}(L_i)$

Encryption and decryption as follows:  ## Section four: Variable length words encryption scheme

in order to deal with the fact that English words differ in length: 