Fastest way to do this?
am 29.11.2007 19:49:52 von eselk
Lets say I've got a list of 50,000 records, and the user has half of
them selected, not in any particular order. I need to update one
field on all selected records. There is a primary key on the table.
What would be the *fastest* way to update those records?
With an update query I'm able to update about 4,000 records per
second. Using a DAO recordset, in a loop, I'm only getting about 500
per second. So obviously an update query would be better, but in this
case there isn't a simple WHERE clause I can use.
My MDB file is on a shared network drive, and multiple users do have
it open at once, so I can't do anything that requires exclusive
access.
I've tried BeginTrans and CommitTrans... although I think I'm in
"transaction" mode by default anyway. I notice I have to call
CommitTrans after about 9,000 records, or the call to Edit fails with
an error code I can't find any info about (0x800A0BEC) and
FormatMessage can't give me any error text for it. Could be "The
required parameter was not provided on the command line or in the
configuration file. ", but that makes no since.
Re: Fastest way to do this?
am 29.11.2007 20:10:41 von Salad
eselk@surfbest.net wrote:
> Lets say I've got a list of 50,000 records, and the user has half of
> them selected, not in any particular order. I need to update one
> field on all selected records. There is a primary key on the table.
> What would be the *fastest* way to update those records?
>
> With an update query I'm able to update about 4,000 records per
> second. Using a DAO recordset, in a loop, I'm only getting about 500
> per second. So obviously an update query would be better, but in this
> case there isn't a simple WHERE clause I can use.
>
> My MDB file is on a shared network drive, and multiple users do have
> it open at once, so I can't do anything that requires exclusive
> access.
>
> I've tried BeginTrans and CommitTrans... although I think I'm in
> "transaction" mode by default anyway. I notice I have to call
> CommitTrans after about 9,000 records, or the call to Edit fails with
> an error code I can't find any info about (0x800A0BEC) and
> FormatMessage can't give me any error text for it. Could be "The
> required parameter was not provided on the command line or in the
> configuration file. ", but that makes no since.
There is a difference between a query and a recordset. So if you have
them "selected" I suppose that's a recordset. Could you not use an
update query instead?
Dim dbs As Database
Dim strSQL As String
set dbs = Currentdb
strSQL = "UPDATE Table1 SET Table1.SDate = Date() Where...;"
dbs.Execute strSQL,dbFailOnError
Re: Fastest way to do this?
am 29.11.2007 20:20:35 von eselk
On Nov 29, 12:10 pm, Salad wrote:
> es...@surfbest.net wrote:
> > Lets say I've got a list of 50,000 records, and the user has half of
> > them selected, not in any particular order. I need to update one
> > field on all selected records. There is a primary key on the table.
> > What would be the *fastest* way to update those records?
>
> There is a difference between a query and a recordset. So if you have
> them "selected" I suppose that's a recordset. Could you not use an
> update query instead?
Let me rephrase. By "selected" I mean the user has them highlighted
in a list, not to be confused with an SQL SELECT statement.
So I've got a list (or "grid control" or "List view"), that is
displaying 50,000 records to the user. The user selects half of these
records, in whatever order they want -- so it could be random/spotty.
Then the user clicks a button to do an action of the 25,000 records
they have selected. When they click this button I need to update 1
field on these 25,000 records.
An UPDATE query seems like the fastest way to update a batch of
records, but probably only if I can use a simple WHERE clause, right?
I don't imagine it would be fast if I have to have a huge WHERE clause
that includes the primary key (a numeric value) of each highlighted
record, right? I'll admit I haven't tried, mostly because I don't
think it would be fast, but also because it would take a bit of code
to create that SQL statement and some more code to issue more than one
statement incase it gets too long.
I'm thinking maybe someone else knows of yet another option? Maybe
updating a local snapshot or temp table first, saving a list of IDs
(primary keys), then doing an UPDATE query that includes a JOIN or
something like that? So I would put all of the record IDs I want to
update in a local/temp table, then issue an UPDATE query on the real
table that would only include everything in the temp table? Is that
something people often do, or something else even better?
Re: Fastest way to do this?
am 29.11.2007 21:17:26 von insomniux
On 29 nov, 20:20, es...@surfbest.net wrote:
> On Nov 29, 12:10 pm, Salad wrote:
>
> > es...@surfbest.net wrote:
> > > Lets say I've got a list of 50,000 records, and the user has half of
> > > them selected, not in any particular order. I need to update one
> > > field on all selected records. There is a primary key on the table.
> > > What would be the *fastest* way to update those records?
>
> > There is a difference between a query and a recordset. So if you have
> > them "selected" I suppose that's a recordset. Could you not use an
> > update query instead?
>
> Let me rephrase. By "selected" I mean the user has them highlighted
> in a list, not to be confused with an SQL SELECT statement.
>
> So I've got a list (or "grid control" or "List view"), that is
> displaying 50,000 records to the user. The user selects half of these
> records, in whatever order they want -- so it could be random/spotty.
> Then the user clicks a button to do an action of the 25,000 records
> they have selected. When they click this button I need to update 1
> field on these 25,000 records.
>
> An UPDATE query seems like the fastest way to update a batch of
> records, but probably only if I can use a simple WHERE clause, right?
> I don't imagine it would be fast if I have to have a huge WHERE clause
> that includes the primary key (a numeric value) of each highlighted
> record, right? I'll admit I haven't tried, mostly because I don't
> think it would be fast, but also because it would take a bit of code
> to create that SQL statement and some more code to issue more than one
> statement incase it gets too long.
>
> I'm thinking maybe someone else knows of yet another option? Maybe
> updating a local snapshot or temp table first, saving a list of IDs
> (primary keys), then doing an UPDATE query that includes a JOIN or
> something like that? So I would put all of the record IDs I want to
> update in a local/temp table, then issue an UPDATE query on the real
> table that would only include everything in the temp table? Is that
> something people often do, or something else even better?
Running access over the network usually is extremely slow. My solution
was to install a 'local user-interface' of the database on each PC
with linked tables to the central database (btw in my case not an
access database). The user-interface only had the forms/reports/
modules, but no tables. This gave an extreme boost in speed. As I
understand you have the solution already at hand in VBA using a
recordset, this could solve your speed problem.
Re: Fastest way to do this?
am 29.11.2007 21:24:04 von Salad
eselk@surfbest.net wrote:
> On Nov 29, 12:10 pm, Salad wrote:
>
>>es...@surfbest.net wrote:
>>
>>>Lets say I've got a list of 50,000 records, and the user has half of
>>>them selected, not in any particular order. I need to update one
>>>field on all selected records. There is a primary key on the table.
>>>What would be the *fastest* way to update those records?
>>
>>There is a difference between a query and a recordset. So if you have
>>them "selected" I suppose that's a recordset. Could you not use an
>>update query instead?
>
>
> Let me rephrase. By "selected" I mean the user has them highlighted
> in a list, not to be confused with an SQL SELECT statement.
>
> So I've got a list (or "grid control" or "List view"), that is
> displaying 50,000 records to the user. The user selects half of these
> records, in whatever order they want -- so it could be random/spotty.
> Then the user clicks a button to do an action of the 25,000 records
> they have selected. When they click this button I need to update 1
> field on these 25,000 records.
I'd hate to be the person that had to manually select 25000 records.
> An UPDATE query seems like the fastest way to update a batch of
> records, but probably only if I can use a simple WHERE clause, right?
> I don't imagine it would be fast if I have to have a huge WHERE clause
> that includes the primary key (a numeric value) of each highlighted
> record, right? I'll admit I haven't tried, mostly because I don't
> think it would be fast, but also because it would take a bit of code
> to create that SQL statement and some more code to issue more than one
> statement incase it gets too long.
Sub Where25000()
Dim rst As Recordset
Dim strW As String
Dim intCnt As Integer
Dim strSQL As String
Set rst = CurrentDb.OpenRecordset("Table1", dbOpenSnapshot)
Do While intCnt < 25000
strW = strW & rst!ID & ","
intCnt = intCnt + 1
rst.MoveNext
Loop
rst.Close
Set rst = Nothing
MsgBox "Selected records"
strW = "(" & Left(strW, Len(strW) - 1) & ")"
strSQL = "Select ID, CName From Table1 Where ID In " & strW
Set rst = CurrentDb.OpenRecordset(strSQL, dbOpenSnapshot)
rst.MoveLast
MsgBox rst.RecordCount
rst.Close
Set rst = Nothing
End Sub
If I set my counter limit to 25000, I was out of memory on my test.
10000 recs and it was quick.
>
> I'm thinking maybe someone else knows of yet another option? Maybe
> updating a local snapshot or temp table first, saving a list of IDs
> (primary keys), then doing an UPDATE query that includes a JOIN or
> something like that? So I would put all of the record IDs I want to
> update in a local/temp table, then issue an UPDATE query on the real
> table that would only include everything in the temp table? Is that
> something people often do, or something else even better?
Yeah, your ideas would be great if you could implement them. If you had
a method that would store the keys as they are selected (and removed if
deselected) I would think you could use a Update with join.
Using my code as an example, if you could scan the 25000 recs but break
it into sections of 10000 records then runing an update then looping
back for the next set of 10000 or until EOF you might find it very
acceptable.
Re: Fastest way to do this?
am 30.11.2007 03:40:32 von Fred Zuckerman
wrote in message
news:8361fcc7-ad03-46df-8b54-2779c9186aff@n20g2000hsh.google groups.com...
> Lets say I've got a list of 50,000 records, and the user has half of
> them selected, not in any particular order. I need to update one
> field on all selected records. There is a primary key on the table.
> What would be the *fastest* way to update those records?
>
> With an update query I'm able to update about 4,000 records per
> second. Using a DAO recordset, in a loop, I'm only getting about 500
> per second. So obviously an update query would be better, but in this
> case there isn't a simple WHERE clause I can use.
>
> My MDB file is on a shared network drive, and multiple users do have
> it open at once, so I can't do anything that requires exclusive
> access.
>
> I've tried BeginTrans and CommitTrans... although I think I'm in
> "transaction" mode by default anyway. I notice I have to call
> CommitTrans after about 9,000 records, or the call to Edit fails with
> an error code I can't find any info about (0x800A0BEC) and
> FormatMessage can't give me any error text for it. Could be "The
> required parameter was not provided on the command line or in the
> configuration file. ", but that makes no since.
How about the code below.
With the following assumptions:
The table has a field named [Data], defined as date/time.
The form is in datasheet view.
Private Sub Form_Current()
'Only run code if multiple records have been selected
If Me.SelHeight > 1 Then
'Ask for confirmation to make changes
If MsgBox("You have selected " & Me.SelHeight & " records." & vbCrLf &
_
"Do you want to update the Data on these records to today's
date?", vbYesNo) = vbYes Then
Dim rst As DAO.Recordset
Dim i As Integer
Set rst = Me.RecordsetClone
rst.MoveFirst
'Move to the first selected record
For i = 1 To (Me.SelTop - 1)
rst.MoveNext
Next
'Change the data for the selected records
For i = 1 To Me.SelHeight
rst.Edit
rst!Data = Date
rst.Update
rst.MoveNext
Next
rst.Close
Set rst = Nothing
End If
End If
End Sub
Let us know if this works and is fast enough.
I did NOT experiment with 50,000 records.
Fred Zuckerman
Re: Fastest way to do this?
am 30.11.2007 18:00:25 von eselk
On Nov 29, 7:40 pm, "Fred Zuckerman" wrote:
>
> How about the code below.
> With the following assumptions:
> The table has a field named [Data], defined as date/time.
> The form is in datasheet view.
>
> Private Sub Form_Current()
>
> 'Only run code if multiple records have been selected
> If Me.SelHeight > 1 Then
>
> 'Ask for confirmation to make changes
> If MsgBox("You have selected " & Me.SelHeight & " records." & vbCrLf &
> _
> "Do you want to update the Data on these records to today's
> date?", vbYesNo) = vbYes Then
> Dim rst As DAO.Recordset
> Dim i As Integer
> Set rst = Me.RecordsetClone
> rst.MoveFirst
>
> 'Move to the first selected record
> For i = 1 To (Me.SelTop - 1)
> rst.MoveNext
> Next
>
> 'Change the data for the selected records
> For i = 1 To Me.SelHeight
> rst.Edit
> rst!Data = Date
> rst.Update
> rst.MoveNext
> Next
>
> rst.Close
> Set rst = Nothing
>
> End If
> End If
> End Sub
>
> Let us know if this works and is fast enough.
> I did NOT experiment with 50,000 records.
> Fred Zuckerman
Thanks Fred. However, this is pretty much what I'm doing now, except
I have to skip records that aren't selected/highlighted (it isn't just
one continous block of selected records). I only get about 500
records per second, and maybe that is as fast as it can be, but an
update query can do 4,000 per second, so I thought I'd see if I can
get somewhere in between those 2 at least.
Re: Fastest way to do this?
am 30.11.2007 18:10:38 von eselk
On Nov 29, 1:17 pm, insomniux wrote:
> Running access over the network usually is extremely slow. My solution
> was to install a 'local user-interface' of the database on each PC
> with linked tables to the central database (btw in my case not an
> access database). The user-interface only had the forms/reports/
> modules, but no tables. This gave an extreme boost in speed. As I
> understand you have the solution already at hand in VBA using a
> recordset, this could solve your speed problem.- Hide quoted text -
Yes, in my mind this is the *only* way to go. I'm not using Access or
VBA as my front-end though so this isn't really related to my speed
issue, I'm writing an app using C++ which uses DAO, but I do my
testing and proto-type stuff in Access and using DAO the way I do
isn't much different than using it from VBA so my questions/comments
still apply to this news-group. Also, I always get almost identical
speed testing results in Access or in my C++ code (which makes since,
because Access 2000 is using DAO basicly the same way I do).
Anyway, even with an EXE I've found that it is a really bad idea to
have your "code" (VBA script, forms, whatever it may be) located on a
network share. You can use network monitor to see that Windows goes
out to the EXE/code a lot as your code is running (it doesn't just
keep the entire EXE in memory like one might think). This is basicly
what a "General Page Fault" is, when Windows wants to get your code,
but it is GONE/moved because the network connection was reset.
Although, that being said, I'd think Access would just keep a local
cache of the forms and scripts (or keep in memory) so it doesn't need
to access the network each time. It requires exclusive access to edit
code/forms/reports anyway, so not like they need to keep checking to
see if the code was modified by another user.
Re: Fastest way to do this?
am 30.11.2007 18:27:39 von Fred Zuckerman
wrote in message
news:0b0adf5d-23b1-47cf-986c-16ec1be63164@x69g2000hsx.google groups.com...
> On Nov 29, 7:40 pm, "Fred Zuckerman" wrote:
>>
>> How about the code below.
>> With the following assumptions:
>> The table has a field named [Data], defined as date/time.
>> The form is in datasheet view.
>>
>> Private Sub Form_Current()
>>
>> 'Only run code if multiple records have been selected
>> If Me.SelHeight > 1 Then
>>
>> 'Ask for confirmation to make changes
>> If MsgBox("You have selected " & Me.SelHeight & " records." &
>> vbCrLf &
>> _
>> "Do you want to update the Data on these records to
>> today's
>> date?", vbYesNo) = vbYes Then
>> Dim rst As DAO.Recordset
>> Dim i As Integer
>> Set rst = Me.RecordsetClone
>> rst.MoveFirst
>>
>> 'Move to the first selected record
>> For i = 1 To (Me.SelTop - 1)
>> rst.MoveNext
>> Next
>>
>> 'Change the data for the selected records
>> For i = 1 To Me.SelHeight
>> rst.Edit
>> rst!Data = Date
>> rst.Update
>> rst.MoveNext
>> Next
>>
>> rst.Close
>> Set rst = Nothing
>>
>> End If
>> End If
>> End Sub
>>
>> Let us know if this works and is fast enough.
>> I did NOT experiment with 50,000 records.
>> Fred Zuckerman
>
> Thanks Fred. However, this is pretty much what I'm doing now, except
> I have to skip records that aren't selected/highlighted (it isn't just
> one continous block of selected records). I only get about 500
> records per second, and maybe that is as fast as it can be, but an
> update query can do 4,000 per second, so I thought I'd see if I can
> get somewhere in between those 2 at least.
How are you selecting non-contiguous blocks of records?
Fred
Re: Fastest way to do this?
am 30.11.2007 19:10:53 von eselk
On Nov 29, 1:24 pm, Salad wrote:
> I'd hate to be the person that had to manually select 25000 records.
Yah, tell me about it, and the sad thing is there are usually easier
ways to do these things in the program. I guess some people would say
the program design is bad if users can't find or understand the easy
ways, but since I think my design is pretty good, I'll go ahead and
blame the end-user =] They could pay someone $10/hour that would do
something like "show all of todays orders only, now flag them all",
but instead they pay someone $6/hour that does "show the master list
of all orders, go through them one at a time and find the ones for
today and select them, now flag just the ones I selected". Even
though it took the person 2 hours to do that, they still expect the
database part to finish in a couple seconds.
> Sub Where25000()
> Dim rst As Recordset
> Dim strW As String
> Dim intCnt As Integer
> Dim strSQL As String
>
> Set rst = CurrentDb.OpenRecordset("Table1", dbOpenSnapshot)
> Do While intCnt < 25000
> strW = strW & rst!ID & ","
> intCnt = intCnt + 1
> rst.MoveNext
> Loop
> rst.Close
> Set rst = Nothing
> MsgBox "Selected records"
>
> strW = "(" & Left(strW, Len(strW) - 1) & ")"
> strSQL = "Select ID, CName From Table1 Where ID In " & strW
> Set rst = CurrentDb.OpenRecordset(strSQL, dbOpenSnapshot)
> rst.MoveLast
> MsgBox rst.RecordCount
>
> rst.Close
> Set rst = Nothing
>
> End Sub
Thanks, this one saves me some work, now I can take your code (thanks
again) and do some tests using this method at least. I'm thinking
this will probably be the method I go with, as long as it is pretty
fast, because it translates best to a SQL server if I ever decide to
go that route... although I might use stored procedures, it still
translates better than using recordsets.
> Yeah, your ideas would be great if you could implement them. If you had
> a method that would store the keys as they are selected (and removed if
> deselected) I would think you could use a Update with join.
Hopefully the large update query works well, but if not, at least I
have another option. Avoiding temp tables sounds nice. I will have
the ID numbers in memory though (I use them to paint the screen,
either selected or not), so at least I'm just looping through memory
to build the large update SQL statement.
I think the reason either method would be faster than looping through
a recordset is because the database engine (JET Red) must do some
optimizations when it knows it has several updates to do at once....
although I would have though using BeginTrans(action) would be the
same. I would think the large SQL statement would be slower because I
have to create it, and then JET has to parse it again, but I have seen
other cases where the SQL statement is still faster (that was MySQL
server though, and I can't recall what I was comparing it against), so
we'll see.
> Using my code as an example, if you could scan the 25000 recs but break
> it into sections of 10000 records then runing an update then looping
> back for the next set of 10000 or until EOF you might find it very
> acceptable.
Luckily I have the IDs in memory, so the loop is pretty simple/fast,
but yes, I'll still need to do this. My records are almost never in
order by ID when the user is looking at them, otherwise I could take
advantage of that, at least for blocks that were continous, like WHERE
(ID >= 10 AND ID <= 100) OR ID = 103 OR ...etc..
I read somewhere else that the WHERE IN is faster anyway, than the
WHERE <>= with a lot of ORs and ANDs.
Re: Fastest way to do this?
am 30.11.2007 19:17:19 von eselk
On Nov 30, 10:27 am, "Fred Zuckerman"
wrote:
> How are you selecting non-contiguous blocks of records?
I'm not using Access forms -- not using Access as my front-end/GUI.
My app is C++, so either Windows ListView control, Tree controls, Grid
controls, or other custom ones I've written myself. I still use
Access for my speed tests and for proto-typing, but I usually just do
a mock-up in VBA code or something, to simulate what the database work
would be like if the user had selected the records. Or I select them
contiguously, but write the code to handle as-if they were not.
Re: Fastest way to do this?
am 01.12.2007 18:11:41 von Salad
eselk@surfbest.net wrote:
> On Nov 29, 1:24 pm, Salad wrote:
> Luckily I have the IDs in memory, so the loop is pretty simple/fast,
> but yes, I'll still need to do this. My records are almost never in
> order by ID when the user is looking at them, otherwise I could take
> advantage of that, at least for blocks that were continous, like WHERE
> (ID >= 10 AND ID <= 100) OR ID = 103 OR ...etc..
I only have a gig mem on my machine so I errored out on 25K records.
>
> I read somewhere else that the WHERE IN is faster anyway, than the
> WHERE <>= with a lot of ORs and ANDs.
Where In is quite fast. Just remember to remove the trailing comma if
you use code to concatenate ids.
1,2,3,
becomes
1,2,3