What is a phantom read?

gk

Here is a Phantom read example I read:

/* Query 1 */

SELECT * FROM users
WHERE age BETWEEN 10 AND 30;

Returns 2 records.



/* Query 2 */
INSERT INTO users VALUES ( 3, 'Bob', 27 );
COMMIT;



/* Query 1 */
SELECT * FROM users
WHERE age BETWEEN 10 AND 30;


Returns 3 records.



This seems normal to me. I don't understand why it is called a
'phantom read'. Look, all the transactions happen at different
times, so we always get the latest data. Initially there were
2 records; later on, 1 record was inserted, so when we run Query 1
again we get the updated data, i.e. 3 records.

So, what is wrong here? What is there to worry about? Why is it called
a phantom read?
 
Jeff Higgins

gk said:
Look, all the transactions happen at different times, so we always
get the latest data.

What has time got to do with it?

gk said:
So, what is wrong here? What is there to worry about? Why is it called
a phantom read?

Did you read the entire article?
<http://en.wikipedia.org/wiki/Isolation_(database_systems)>
 
Lew

Pay attention here, gk!

The key word here is "entire". You might otherwise miss
"Note that *transaction 1 executed the same query twice*. [emph. orig.] If the
highest level of isolation were maintained, the same set of rows should be
returned both times, and indeed that is what is mandated to occur in a
database operating at the SQL SERIALIZABLE isolation level. However, at the
lesser isolation levels, a different set of rows may be returned the second time."

The fact that the same query returns different results at lesser levels means
that one or both results are "phantoms" - not the real answer.

gk said:
It seems perfectly normal to me. Do you see any trouble in this
scenario? I don't understand where the trouble is yet. Why would
trouble come up?

The part you're missing is that the two queries occur *inside the same
transaction*. That's *inside the same transaction*. It's the fact that it's
the *same* transaction getting different results that makes it a "problem".
If the isolation level is low, then the transaction is not isolated (get it?)
from the effects of the other transaction. Were the two queries in different
transactions the isolation level would be irrelevant, but they're in the same
transaction.

This is not to say you always need repeatable-read isolation, but when you do,
phantom reads are a "problem".

Why might you need repeatable read? Well, if you're building intermediate
results, say bringing in a set of rows to process, you could get bizarre
results if that set changes while the transaction progresses. It's sort of
like a 'ConcurrentModificationException' in the collections classes. You
can't build a house on shifting sands.

If you could, you wouldn't bother putting the multiple queries in the *same
transaction*.
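
The scenario from the start of the thread can be mimicked without a database. What follows is only a toy sketch, not how any real DBMS implements isolation: 'UsersTable', 'ReaderTransaction' and the two starting ages (20 and 25) are made up for illustration, and "isolated" is modelled crudely as reading from a copy taken when the transaction starts.

```java
import java.util.ArrayList;
import java.util.List;

class PhantomReadDemo {
    /** Toy stand-in for the users table: it only stores ages. */
    static final class UsersTable {
        private final List<Integer> ages = new ArrayList<>();

        void insert(int age) {
            ages.add(age);
        }

        List<Integer> snapshot() {
            return new ArrayList<>(ages);
        }
    }

    /** A "transaction" that reads live rows, or a copy taken when it began. */
    static final class ReaderTransaction {
        private final UsersTable table;
        private final List<Integer> snapshotAtStart; // null means low isolation

        ReaderTransaction(UsersTable table, boolean isolated) {
            this.table = table;
            this.snapshotAtStart = isolated ? table.snapshot() : null;
        }

        long countAgesBetween(int low, int high) {
            List<Integer> rows = (snapshotAtStart != null) ? snapshotAtStart : table.snapshot();
            return rows.stream().filter(age -> age >= low && age <= high).count();
        }
    }

    /** Runs Query 1, lets "Transaction 2" insert Bob, then runs Query 1 again. */
    static long[] run(boolean isolated) {
        UsersTable users = new UsersTable();
        users.insert(20); // assumed existing row
        users.insert(25); // assumed existing row
        ReaderTransaction tx1 = new ReaderTransaction(users, isolated);
        long first = tx1.countAgesBetween(10, 30);  // Query 1
        users.insert(27);                           // Query 2: Bob is inserted and committed
        long second = tx1.countAgesBetween(10, 30); // Query 1 again, same transaction
        return new long[] {first, second};
    }

    public static void main(String[] args) {
        long[] low = run(false);
        long[] isolated = run(true);
        System.out.println("low isolation: " + low[0] + " then " + low[1]);
        System.out.println("isolated:      " + isolated[0] + " then " + isolated[1]);
    }
}
```

Under these assumptions the low-isolation run sees 2 rows and then 3, while the isolated run sees 2 rows both times. The extra row in the first case is the phantom.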
 
Andreas Leitgeb

gk said:
It seems perfectly normal to me. Do you see any trouble in this
scenario? I don't understand where the trouble is yet. Why would
trouble come up?

It may be normal, but there are also other definitions of "normal".
Also, a phantom isn't necessarily something abnormal.

The phantomity lies in that you get something back from a read, but
cannot be sure that that thing you just read is still there exactly
the same way the very next nanosecond.

Other isolation levels, otoh, will guarantee, that what you saw
once (exactly those two lines), you'll see (or be able to update/
delete) anytime later until *your* session does a commit or rollback.

In Java, access to shared variables (e.g. fields of instances known
to more than one thread) is also like phantom reads, unless all
writing threads agree on respecting some particular lock: then a
thread holding that lock will get repeatable reads until it drops
the lock.
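
The Java analogy can be sketched roughly as follows. This is an illustration only: 'SharedCounter' and its methods are invented names, and 'ReentrantLock' is just one of several ways to get the same effect. The writer always takes the lock before changing the value, so a reader that holds the lock sees the same value on both reads.

```java
import java.util.concurrent.locks.ReentrantLock;

class SharedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    /** Writers must take the lock before changing the value. */
    void increment() {
        lock.lock();
        try {
            value++;
        } finally {
            lock.unlock();
        }
    }

    /** Reads the value twice while holding the lock, so no writer can slip in between. */
    int[] readTwice() throws InterruptedException {
        lock.lock();
        try {
            int first = value;
            Thread.sleep(20); // leave time for a writer to try to interfere
            int second = value;
            return new int[] {first, second};
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter counter = new SharedCounter();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                counter.increment();
            }
        });
        writer.start();
        int[] reads = counter.readTwice();
        writer.join();
        System.out.println("reads under lock match: " + (reads[0] == reads[1]));
    }
}
```

If 'readTwice()' did not take the lock, the two reads could differ, which is the in-memory cousin of the database problem discussed here.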
 
gk

Here is a summary of what I understand.

Yes, I see that Query 1 has been executed twice in the same
transaction, i.e. Transaction 1.

The same select query should return the same results within the same
transaction, no matter how many times it is executed. But the first
execution returned 2 records and the second returned 3 records, even
though both executions are in the same transaction. This is very bad.
It can happen only when the isolation level is low. Here we hit a low
isolation problem, hence the discrepancy: another transaction,
Transaction 2, is interfering with Transaction 1's results, so we get
a wrong result set in Transaction 1. This is called a phantom read.

What is the resolution, then? Do we have to do anything on the Java
side, or will the database take care of it automatically?
 
Lew

gk said:
Yes. I see Query 1 has been executed two times in the same
transaction, i.e., Transaction 1.

The same select query should return the same results in the same
transaction, irrespective of the number of executions.

Sometimes.

gk said:
But we see the first select query execution got 2 records and the
second returned 3 records, though both executions are in the same
transaction. This is very bad.

It's sometimes bad. Not always.

gk said:
This can happen only when there is a low isolation level. Here we
faced a low isolation problem and hence we are getting this
discrepancy. Another transaction, i.e. Transaction 2, is interfering
with Transaction 1's results, and so we are getting a wrong result
set in Transaction 1. This is called a phantom read.

Correct.

gk said:
What's the resolution then? Do we have to do anything from the Java
side, or will it be taken care of by the database itself automatically?

Set the transaction isolation level for the database.

The answer, as usual, lies in the Javadocs:

<http://java.sun.com/javase/6/docs/api/java/sql/Connection.html#setTransactionIsolation(int)>
 
Mike Schilling

gk said:
What's the resolution then? Do we have to do anything from the Java
side, or will it be taken care of by the database itself automatically?

Java is simply reporting what the DBMS returns. If you want stricter
isolation, you need to tell the DBMS to apply it (which you can do via
JDBC.)
 
gk

Interesting... Yes, I can see 5 field attributes.

While coding, shall I do this?

conn.setTransactionIsolation(conn.TRANSACTION_SERIALIZABLE);
// select record in table1
// insert record in table1
// select record in table1
conn.commit();


Is there any other code I need? Please let me know. Where and
how do I write the transaction begin and transaction end in this code?
 
Lew


gk quoted the sig:
gk, please don't quote sigs.
gk said:
Interesting... Yes, I can see 5 field attributes.

While coding, shall I do this?

conn.setTransactionIsolation(conn.TRANSACTION_SERIALIZABLE);

Don't dereference static members through the instance, dereference them
through the type. You should have written 'Connection.TRANSACTION_SERIALIZABLE'.

gk said:
//select record in table1
// insert record in table1
//select record in table1
conn.commit.

That depends. You don't indicate where you wish the transactions to begin and
end, or even how many transactions you want.

You don't necessarily need the highest level of transaction isolation; that's
why there is more than one level.

gk said:
Is there any other extra code I need? Please let me know where and
how I write the transaction begin and transaction end in this code.

For where, that depends on where you want to put the transaction boundaries.

For how, the answer, as usual, lies in the Javadocs:
<http://java.sun.com/javase/6/docs/api/java/sql/Connection.html#setAutoCommit(boolean)>
<http://java.sun.com/javase/6/docs/api/java/sql/Connection.html#commit()>
<http://java.sun.com/javase/6/docs/api/java/sql/Connection.html#rollback()>

Essentially, 'setAutoCommit(false)' opens the first transaction, and 'commit()'
and 'rollback()' each end the current transaction and begin the next one.
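
Those three calls, plus 'setTransactionIsolation(int)', might be combined along the following lines. This is a minimal sketch under stated assumptions, not the one true pattern: the 'TransactionBoundaries' class and its 'Work' callback are hypothetical helpers invented here, and the caller is assumed to supply an open 'Connection' from whatever driver is in use.

```java
import java.sql.Connection;
import java.sql.SQLException;

class TransactionBoundaries {
    /** Hypothetical callback holding the statements that belong to one transaction. */
    interface Work {
        void run(Connection conn) throws SQLException;
    }

    /** Runs the work as one transaction at the requested isolation level. */
    static void inTransaction(Connection conn, int isolationLevel, Work work) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        // Set the level before any work starts; changing it mid-transaction
        // is implementation-defined.
        conn.setTransactionIsolation(isolationLevel);
        conn.setAutoCommit(false);   // statements from here on share one transaction
        try {
            work.run(conn);
            conn.commit();           // ends the transaction, keeping its changes
        } catch (SQLException | RuntimeException e) {
            conn.rollback();         // ends the transaction, discarding its changes
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A call would look something like 'TransactionBoundaries.inTransaction(conn, Connection.TRANSACTION_SERIALIZABLE, c -> { /* both selects go here */ });'. Whether the driver and DBMS honour the requested level is up to them, so it is worth checking 'DatabaseMetaData.supportsTransactionIsolationLevel(int)' for the database at hand.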

Beyond that, have you considered reading the Java Tutorials? Google?
<http://java.sun.com/docs/books/tutorial/jdbc/index.html>

GIYF.

You can't learn a topic comprehensively very well only by asking questions in
Usenet on tiny details. You need an overview and a foundation. The materials
are out there; learn to use them. Get in the habit of using them. You will
make poor progress until you do.
 
gk

Are PHANTOM READ and NON-REPEATABLE READ the same thing?

I read the following about NON-REPEATABLE READ on jGuru.com:

"....One of the ISO-ANSI SQL defined "phenomena" that can occur with
concurrent transactions. If one transaction reads a row, then another
transaction updates or deletes the row and commits, the first
transaction, on re-read, gets modified data or no data. This is an
inconsistency problem within a transaction and addressed by isolation
levels...."


But this is similar to the PHANTOM READ we have discussed so far!

Are NON-REPEATABLE READ and PHANTOM READ the same thing? I can't
find the difference.
 
Lew

Don't quote sigs.

Do trim your posts.

Don't quote sigs.

Pay attention to what you're posting. Be conscious, and show some effort.

Don't quote sigs.

gk said:
Are PHANTOM READ and NON-REPEATABLE READ the same thing?

http://en.wikipedia.org/wiki/Isolation_(database_systems)

gk said:
But this is similar to the PHANTOM READ we have discussed so far!

Are NON-REPEATABLE READ and PHANTOM READ the same thing? I can't
find the difference.

The Wikipedia article up at the start of this thread addresses your question.

http://en.wikipedia.org/wiki/Isolation_(database_systems)
"Repeatable reads (phantom reads)"

 
Martin Gregorie

gk said:
While coding, shall I do this?

conn.setTransactionIsolation(conn.TRANSACTION_SERIALIZABLE);
//select record in table1
// insert record in table1
//select record in table1
conn.commit.

Is there any other extra code I need? Please let me know where and
how I write the transaction begin and transaction end in this code.

In your example you're doing both reads and the insert in the same
transaction, so *of course* you'd expect the two selects to return
different results. On the other hand, if there were two transactions:

# 1st transaction: serialised           # 2nd transaction: auto-commit
# explicit commit unit                  # implicit commit unit

conn.setTransactionIsolation
   (Connection.TRANSACTION_SERIALIZABLE);
 // select record in table1
 ...                                     // insert record in table1
 // select record in table1
conn.commit();


...then in this case you'd be right to expect the two selects to return
the same data set because they are in the same serialisable commit unit
and the insert is outside it.

It doesn't matter whether the two transactions are run by separate
programs or both by the same program: that has no impact on transaction
isolation.
 
gk


Please see this

http://docs.google.com/View?id=dc83hzcs_380f8hkzdfb

I still don't see any difference. Both problems, PHANTOM READ and
NON-REPEATABLE READ, look the same: *Transaction 1 showing
different results in two runs of Query 1*.

Where is the key difference between them, then?
 
Lew

gk said:
I still don't see any difference. Both problems, PHANTOM READ and
NON-REPEATABLE READ, look the same: *Transaction 1 showing
different results in two runs of Query 1*.

Where is the key difference between them, then?

Did you read my answer to this question?
Phantom reads are a consequence of not setting transaction isolation to at
least REPEATABLE READ.

You should read the answers to the questions you ask.

Please trim your posts.
 
gk

Did you look at the link I posted?
http://docs.google.com/View?id=dc83hzcs_380f8hkzdfb

Those are taken from Wikipedia. What I was asking was for you to look
at the definitions of PHANTOM READ and NON-REPEATABLE READ there. It
seems to me that both of them describe the same problem. Both of
them show *Transaction 1 showing different results in two runs
of Query 1*.

Where is the key difference (if any)?
 
Lew

gk said:
Those are taken from Wikipedia. What I was asking was for you to look
at the definitions of PHANTOM READ and NON-REPEATABLE READ there. It
seems to me that both of them describe the same problem. Both of
them show *Transaction 1 showing different results in two runs
of Query 1*.

Where is the key difference (if any)?

Phantom read is a different number of rows returned from the query.
Non-repeatable read is different values returned for a given row.

Once again, this question was answered in the web page to which you've been
repeatedly directed:
http://en.wikipedia.org/wiki/Isolation_(database_systems)

specifically at
http://en.wikipedia.org/wiki/Isolation_(database_systems)#Repeatable_reads_.28phantom_reads.29
vs.
http://en.wikipedia.org/wiki/Isolation_(database_systems)#Read_Committed_.28Non-repeatable_reads.29
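
One way to make that distinction concrete is to compare two reads of the same query, keyed by primary key. The sketch below is only an illustration: the 'ReadAnomalies' class, its method names, and the rows for Joe and Jill are invented here, while Bob's row comes from the example at the top of the thread. It treats a row that newly appears as a phantom, and a previously read row that comes back changed or missing as a non-repeatable read, in line with the jGuru wording quoted earlier.

```java
import java.util.Map;
import java.util.TreeMap;

class ReadAnomalies {
    /** True if the second read contains rows (keys) the first read did not: phantoms. */
    static boolean hasPhantoms(Map<Integer, String> firstRead, Map<Integer, String> secondRead) {
        return !firstRead.keySet().containsAll(secondRead.keySet());
    }

    /** True if a row seen in the first read comes back changed or missing. */
    static boolean hasNonRepeatableReads(Map<Integer, String> firstRead,
                                         Map<Integer, String> secondRead) {
        for (Map.Entry<Integer, String> row : firstRead.entrySet()) {
            if (!row.getValue().equals(secondRead.get(row.getKey()))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Integer, String> first = new TreeMap<>();
        first.put(1, "Joe, 20");   // made-up row
        first.put(2, "Jill, 25");  // made-up row

        Map<Integer, String> afterInsert = new TreeMap<>(first);
        afterInsert.put(3, "Bob, 27");  // another transaction inserted a row

        Map<Integer, String> afterUpdate = new TreeMap<>(first);
        afterUpdate.put(1, "Joe, 21");  // another transaction updated a row

        System.out.println("insert: phantom=" + hasPhantoms(first, afterInsert)
                + " nonRepeatable=" + hasNonRepeatableReads(first, afterInsert));
        System.out.println("update: phantom=" + hasPhantoms(first, afterUpdate)
                + " nonRepeatable=" + hasNonRepeatableReads(first, afterUpdate));
    }
}
```

With these inputs the INSERT case reports a phantom but no non-repeatable read, and the UPDATE case reports the reverse.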
 
Lew

Lew said:
Phantom reads are a consequence of not setting transaction isolation to
at least REPEATABLE READ.

Oops. I mean at least *stricter than* REPEATABLE READ, i.e., SERIALIZABLE. I
apologize for the misinformation.
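
For reference, the relationship between the JDBC isolation constants and these two phenomena can be written down as a small lookup. This is a sketch of what the 'java.sql.Connection' Javadoc describes, not of what any particular database does; real products may be stricter than the minimum. The 'IsolationLevels' class and its method names are invented for illustration, and the fifth constant, 'TRANSACTION_NONE', is left out because it means transactions are not supported at all.

```java
import java.sql.Connection;

class IsolationLevels {
    /** Readable name for one of the four transactional levels. */
    static String name(int level) {
        switch (level) {
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "READ_UNCOMMITTED";
            case Connection.TRANSACTION_READ_COMMITTED:   return "READ_COMMITTED";
            case Connection.TRANSACTION_REPEATABLE_READ:  return "REPEATABLE_READ";
            case Connection.TRANSACTION_SERIALIZABLE:     return "SERIALIZABLE";
            default:
                throw new IllegalArgumentException("not a transactional level: " + level);
        }
    }

    /** Non-repeatable reads can occur below REPEATABLE_READ. */
    static boolean permitsNonRepeatableReads(int level) {
        name(level); // rejects unknown levels
        return level == Connection.TRANSACTION_READ_UNCOMMITTED
                || level == Connection.TRANSACTION_READ_COMMITTED;
    }

    /** Phantom reads can occur at every level below SERIALIZABLE. */
    static boolean permitsPhantomReads(int level) {
        name(level); // rejects unknown levels
        return level != Connection.TRANSACTION_SERIALIZABLE;
    }

    public static void main(String[] args) {
        int[] levels = {
            Connection.TRANSACTION_READ_UNCOMMITTED,
            Connection.TRANSACTION_READ_COMMITTED,
            Connection.TRANSACTION_REPEATABLE_READ,
            Connection.TRANSACTION_SERIALIZABLE
        };
        for (int level : levels) {
            System.out.println(name(level)
                    + ": non-repeatable reads possible=" + permitsNonRepeatableReads(level)
                    + ", phantom reads possible=" + permitsPhantomReads(level));
        }
    }
}
```

The REPEATABLE_READ row is the one that matters for the correction above: it rules out non-repeatable reads but still permits phantoms.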
 
gk

Lew said:
Phantom read is a different number of rows returned from the query.

True, because in the example they are using "INSERT" in transaction 2.

Surprisingly, phantom read is also called "REPEATABLE READ". Is that
because they are not MODIFYING the values but adding some extra
records, i.e. it repeats the values with some additional records?

Lew said:
Non-repeatable read is different values returned for a given row.

True, because in the example they are using "UPDATE" in transaction 2.

However, both of them show faulty results at the end of
Transaction 1.
 
