Monday, November 14, 2016

Large Data Volumes Interview Questions in Salesforce

1 : How Standard Fields and Custom Fields related information is saved inside Salesforce Database? Is every Standard and Custom Object is created as a different Database table?
Ans :
Salesforce is using Multi-tenant architecture, means many organizations (Tenants) are using same infrastructure. Salesforce Database saves Metadata Information in hundreds of table. Run time engine then generates organization specific query to get information about their organizations and Data from common table as shown in below diagram. Below Database tables are partitioned by Organization ID and generates virtual table specific to Org.
2 : As a Developer, how can you optimize SQL query to fetch data from Salesforce Database?
Ans :
As Salesforce doesn’t save data in traditional way. Data of all tenants are in common table, so traditional Query optimization query and technique will not work in this case, so there is no such tool available to optimize final generated SQL. We only have option to create SOQL which is optimized by custom inbuilt Force.com Query Optimizer.
In Summer14, Salesforce released Query Plan Tool to analyze how query is performing. With help of this tool, we can get an idea how we can change our query to perform better.

3 : When records are created in Salesforce, How it is queued for Indexing?
Ans :
If newly created records are equal to or less than 9000, then it will be indexed in 1 to 3 minutes. However if records are more than 9000, then servers perform bulk indexing at a lower priority, so processing might take longer.

4 : Explain functionality of Force.com Query Optimizer.
Ans :
The Force.com query optimizer:
  1. Determines the best index from which to drive the query, if possible, based on filters in the query
  2. Determines the best table to drive the query from if no good index is available
  3. Determines how to order the remaining tables to minimize cost
  4. Injects custom foreign key value tables as needed to create efficient join paths
  5. Influences the execution plan for the remaining joins, including sharing joins, to minimize database input/output (I/O)
  6. Updates statistics

5 : Explain term “Data Skew” in Salesforce.
Ans :
Scenario in which parent record has more than 10,000 of records or if any user owns more than 10,000 records is known as “Data Skew”. Salesforce does not recommend having more than 10,000 records own by any user or more than 10,000 Child for any parent.

6 : Explain Skinny table.
Ans :
Salesforce creates skinny tables to contain frequently used fields and to avoid joins, and it keeps the skinny tables in sync with their source tables when the source tables are modified. To enable skinny tables, contact Salesforce.com Customer Support.
For each object table, Salesforce maintains other, separate tables at the database level for standard and custom fields. This separation ordinarily necessitates a join when a query contains both kinds of fields. A skinny table contains both kinds of fields and does not include soft-deleted records.
This table shows an Account view, a corresponding database table, and a skinny table that would speed up Account queries.
7 : What are the considerations for Skinny Table?
Ans :
  • Skinny tables can contain a maximum of 100 columns.
  • Skinny tables cannot contain fields from other objects.
  • Skinny tables are not copied to sandbox organizations. To have production skinny tables activated in a sandbox organization, contact salesforce.com Customer Support.

8 : Which fields are automatically Indexed in Salesforce?
Ans :
  • RecordTypeId
  • Division
  • CreatedDate
  • Systemmodstamp (LastModifiedDate)
  • Name
  • Email (for contacts and leads)
  • Foreign key relationships (lookups and master-detail)
  • The unique Salesforce record ID, which is the primary key for each object

9 : Which fields cannot be added as a custom Index?
Ans :
  • multi-select picklists
  • text area (long)
  • text area (rich)
  • non-deterministic formula fields (Like any formula field using function NOW() or Today() )
  • encrypted text fields

10 : When Salesforce will use Standard Indexed fields?
Ans :
Salesforce maintains statistics table which stores information about records present in Organization. If records going to be searched is less than or equal to 30% of total records or up to 1 million records then only it makes sense to use standard Indexed fields to narrow result else total records going to be returned is more than 30% already so Salesforce will not use any indexing.

11 : When Salesforce will use Custom Indexed fields?
Ans :
Salesforce maintains statistics table which stores information about records present in Organization. If records going to be searched is less than or equal to 10% of total records or up to 333,333 records then only it makes sense to use standard Indexed fields to narrow result else total records going to be returned is more than 10% already so Salesforce will not use any indexing.

12 : What are examples of Non-deterministic Force.com formula fields?
Ans :
  • Reference other entities (i.e., fields accessible through lookup fields
  • Include other formula fields that span over other entities
  • Use dynamic date and time functions (e.g., TODAY and NOW)
  • If formula field includes
    • Owner, autonumber, divisions, or audit fields (except for CreatedDate and CreatedByID fields
    • References to fields that Force.com cannot index
    • Multi-select picklists
    • Currency fields in a multicurrency organization
    • Long text area fields
    • Binary fields (blob, file, or encrypted text)
There are few standard fields also which are considered as non-deterministic which can be found in salesforce documentations.

13 : Explain Two-Column Custom Indexes.
Ans :
Two-column custom indexes are a specialized feature of the Salesforce platform. They are useful for list views and other situations in which you want to use one field to select the records to display and a second field to sort those records.
Two-column indexes are subject to the same restrictions as single-column indexes, with one exception. Two-column indexes can have nulls in the second column by default, whereas single-column indexes cannot, unless salesforce.com Customer Support has explicitly enabled the option to include nulls.

14 : What is Defer Sharing Calculation ?
Ans :
This feature allows users to defer the processing of sharing rules until after new users, rules, and other content have been loaded. This is very useful and handy feature to speed up data loading by avoiding calculation of Sharing rules.

15 : How can we load millions of records in Salesforce within an hour ?
Ans : We can use Bulk Data loading and turn ON Parallel loading. Check this Webinar recording for loading 20 millions record in one hour.

16 : In case of parallel data loading, how to avoid record locked error ?
Ans : Record locked error can be avoided by two ways
  1. Change schema of Object : Check every Lookup field and make sure that in Lookup Option you have not selected “Don’t allow deletion of the lookup record that’s part of lookup relationship”. Because of this selection, even during insert operation system hold lock against record and other batches accessing same record fails.
  2. Order Insert operations in CSV File : In this option, Sort column containing parent record Id for Lookup fields. So all records of same parent will be loaded in same batch and parent record locked problem across batch will be resolved. Check this Webinar recording for loading 20 millions record in one hour.
  3. Other reasons of lock may be Rollup Summary, Workflow, Trigger etc

1 comment:

  1. Thanks for sharing good Information

    ETG is a Salesforce Commerce Cloud Implementation Partner. For 20 years, ETG Global Services, Inc. has successfully implemented eCommerce solutions for growth and enterprise clients.

    ReplyDelete